Crawler Python API

Getting started with Crawler is easy. The main class you need to care about is crawler.main.Crawler.

crawler.main

Main Module

class crawler.main.Crawler(url, delay, ignore)

Main Crawler object.

Example:

c = Crawler('http://example.com')
c.crawl()

Parameters:
    delay – Number of seconds to wait between searches
    ignore – Paths to ignore
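This page does not say how delay and ignore are applied internally. As a rough sketch only, assuming ignore is a list of path prefixes and delay is a pause between consecutive requests, the behaviour might look like the following. The helpers should_ignore and polite_visit are hypothetical names used for illustration; they are not part of Crawler.

```python
import time
from urllib.parse import urlparse


def should_ignore(url, ignore):
    """Return True if the URL's path starts with any ignored prefix.

    Assumption: prefix matching on the path; Crawler may match differently.
    """
    path = urlparse(url).path or "/"
    return any(path.startswith(prefix) for prefix in ignore)


def polite_visit(urls, delay, ignore, visit, sleep=time.sleep):
    """Call visit(url) for each URL that is not ignored, pausing between calls."""
    visited = []
    for url in urls:
        if should_ignore(url, ignore):
            continue
        if visited:
            sleep(delay)  # wait only between requests, not before the first
        visit(url)
        visited.append(url)
    return visited
```

Passing sleep as an argument keeps the pause replaceable, which makes the logic easy to exercise without actually waiting.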

crawl()

Crawl the URL set up in the crawler.

This is the main entry point, and will block while it runs.
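Because crawl() blocks, a caller that needs to keep working in the meantime could run it in a background thread. The sketch below uses a stand-in class, StubCrawler, so that it runs without the library installed; with the real library you would presumably pass your Crawler instance instead. run_in_background is a hypothetical helper, and this page does not state whether Crawler is safe to use from a thread.

```python
import threading


class StubCrawler:
    """Stand-in for crawler.main.Crawler; only mimics a blocking crawl()."""

    def __init__(self):
        self.finished = False

    def crawl(self):
        # A real crawl would fetch pages here and return only when done.
        self.finished = True


def run_in_background(crawler):
    """Start crawler.crawl() in a daemon thread and return the thread."""
    thread = threading.Thread(target=crawler.crawl, daemon=True)
    thread.start()
    return thread


stub = StubCrawler()
worker = run_in_background(stub)
worker.join(timeout=5)  # wait (up to 5 seconds) for the crawl to finish
```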

get(url)

Get a specific URL, log its response, and return its content.

Parameters:	url – The fully qualified URL to retrieve
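To illustrate the "retrieve, log, return" pattern described for get(), here is a minimal standard-library sketch. It is not the library's implementation: the function name fetch, the logger name, and the opener parameter are all assumptions made for this example.

```python
import logging
import urllib.request

log = logging.getLogger("crawler.sketch")  # hypothetical logger name


def fetch(url, opener=urllib.request.urlopen):
    """Retrieve url, log the response status, and return the body as bytes."""
    with opener(url) as response:
        # Not every response object exposes a status, so fall back to None.
        status = getattr(response, "status", None)
        content = response.read()
    log.info("GET %s -> %s (%d bytes)", url, status, len(content))
    return content
```

The opener argument defaults to urllib.request.urlopen and can be swapped for a stub, so the function can be tried without network access.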

crawler.main.run_main()

A small wrapper used for running Crawler as a CLI script.