simple scraping script
Hello, I need a simple script:
step 0: you get a file with a list of URLs (hundreds or thousands); they come in all sorts of formats (subdomains, https, many SLDs/TLDs).
step 1: you extract the domain names from the URLs and generate a sorted list of unique domains. This is not as simple as it sounds, because the function doing it must be able to tokenize any URL format as well as any form of TLD (for example .org.nz, .fr, .co.cr).
step 2: clean the list to remove some domains, such as free blogs or .gov.
step 3: scrape namecheap.com to get one data point for some of the domains.
step 4: scrape domaintools.com to get some data for a short list of domains (without getting banned for excessive usage).
step 5: scrape two data points from the alexa.com page for each domain in the list.
step 6: sort the list and output it as a flat file.
Potential for long-term work with the right programmer(s).
Keywords: MySQL, PHP, HTML
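To illustrate why step 1 is harder than it sounds, here is a minimal sketch of steps 1 and 2. It is written in Python purely for illustration (the posting lists PHP) and is not a finished implementation. The suffix set, the blocked lists, and the function names `registrable_domain`, `is_blocked`, and `clean_domains` are all assumptions made for this example; a real script would probably load the full Public Suffix List instead of a hand-picked sample.

```python
import re
from urllib.parse import urlsplit

# Assumption: a tiny hand-picked sample of multi-label suffixes.
# A real script would load the full Public Suffix List instead.
MULTI_LABEL_SUFFIXES = {"org.nz", "co.nz", "co.cr", "co.uk", "com.au"}

# Assumption: illustrative exclusions for step 2.
BLOCKED_TLDS = {"gov"}
BLOCKED_DOMAINS = {"blogspot.com", "wordpress.com"}

SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")


def registrable_domain(url):
    """Return the registrable domain of a URL, or None if none is found."""
    url = url.strip()
    if not url:
        return None
    # urlsplit only fills in the host when a scheme or "//" is present.
    if not SCHEME_RE.match(url) and not url.startswith("//"):
        url = "//" + url
    try:
        host = urlsplit(url).hostname  # lowercased, port and userinfo removed
    except ValueError:
        return None
    if not host:
        return None
    labels = host.rstrip(".").split(".")
    # Skip single-label hosts and anything that looks like an IPv4 address.
    if len(labels) < 2 or labels[-1].isdigit():
        return None
    # Keep three labels when the last two form a known multi-label suffix.
    keep = 3 if ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES else 2
    if keep > len(labels):
        return None  # the host is a bare suffix such as "org.nz"
    return ".".join(labels[-keep:])


def is_blocked(domain):
    """Step 2: drop free-blog hosts and blocked TLDs such as .gov."""
    return domain in BLOCKED_DOMAINS or domain.split(".")[-1] in BLOCKED_TLDS


def clean_domains(urls):
    """Steps 1 and 2: unique, filtered, sorted registrable domains."""
    domains = {registrable_domain(u) for u in urls}
    domains.discard(None)
    return sorted(d for d in domains if not is_blocked(d))
```

Writing `"\n".join(clean_domains(lines))` to a file would then cover the flat-file output of step 6. The scraping in steps 3 to 5 is left out here because it depends on the current page layout and usage terms of each site.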
Expired