1242. Web Crawler Multithreaded
Given a URL startUrl and an interface HtmlParser, implement a Multi-threaded web crawler to crawl all links that are under the same hostname as startUrl. Return all URLs obtained by your web crawler in any order. Your crawler should: Start from the page: startUrl Call HtmlParser.getUrls(url) to get all URLs from a webpage of a given URL. Do not crawl the same link twice. Explore only the links that are under the same hostname as startUrl. As shown in the example URL above, the hostname is example.org. For simplicity’s sake, you may assume all URLs use HTTP protocol without any port specified. For example, the URLs http://leetcode.com/problems and http://leetcode.com/contest are under the same hostname, while URLs http://example.org/test and http://example.com/abc are not under the same hostname. ...