| Version | Date | Description |
|---|---|---|
| 0.5 | 2005-09-26 | Ease-of-use release |
| 0.4 | 2005-08-10 | Four new features |
| 0.3 | 2004-06-23 | Biggest change is POST ability |
| 0.2 | 2004-04-13 | OSCube + Example |
| 0.1 | 2003-12-23 | Initial release |

| Changes | By |
|---|---|
| Manual written to make it easier to start using Scraping-Engine | hen |
| Template structure created to make it easier to start using Scraping-Engine | hen |
| Catch RuntimeExceptions in the ScrapingRunner so that a problem in one scraper does not stop the others from running | hen |
| Basic Authentication is now supported for http://, not just https:// | hen |
| Updated to oscube 0.3, which affects scheduling configuration options; to simple-jndi 0.11, which requires changes to your sj .properties files; and to gj-core 3.1, which gets rid of some pesky warnings | hen |
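
The RuntimeException change above amounts to wrapping each scraper run in its own try/catch. A minimal sketch of that idea follows; `IsolatedRunnerSketch` and `runAll` are hypothetical names chosen for illustration, not the actual ScrapingRunner code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names are illustrative, not Scraping-Engine's real classes.
class IsolatedRunnerSketch {

    /** Runs every scraper; a RuntimeException in one is recorded and the loop continues. */
    static List<String> runAll(List<Runnable> scrapers) {
        List<String> failures = new ArrayList<>();
        for (int i = 0; i < scrapers.size(); i++) {
            try {
                scrapers.get(i).run();
            } catch (RuntimeException e) {
                // Record the failure instead of letting it abort the remaining scrapers.
                failures.add("scraper " + i + ": " + e.getMessage());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> done = new ArrayList<>();
        List<Runnable> scrapers = new ArrayList<>();
        scrapers.add(() -> done.add("first"));
        scrapers.add(() -> { throw new IllegalStateException("bad page"); });
        scrapers.add(() -> done.add("third"));

        List<String> failures = runAll(scrapers);
        System.out.println("completed: " + done);
        System.out.println("failures: " + failures);
    }
}
```

Only `RuntimeException` is caught here, matching the changelog entry; errors such as `OutOfMemoryError` would still propagate.
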

| Changes | By |
|---|---|
| Caches the HttpClient object for a scraper, thereby supporting cookies. Fixes SCB-17. | hen |
| Ability to set any HTTP header for the initial request. If a User-Agent header is specified, Scraping-Engine will use it when consulting robots.txt. Fixes SCB-18. | hen |
| ftp:// URLs are now supported in much the same way as http:// and https://. Fixes SCB-21. | hen |
| Fetchers are now pluggable. Fixes SCB-24. | hen |
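
One way to picture pluggable fetchers alongside the new ftp:// support is a registry keyed by URL scheme. The sketch below assumes that design; the changelog does not say how Scraping-Engine selects a fetcher, and `FetcherRegistrySketch`, `Fetcher`, and `register` are hypothetical names. The fetchers return canned strings rather than touching the network.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: a scheme-keyed registry is an assumption, not Scraping-Engine's API.
class FetcherRegistrySketch {

    /** A fetcher turns a URI into the fetched text. */
    interface Fetcher {
        String fetch(URI uri);
    }

    private final Map<String, Fetcher> byScheme = new HashMap<>();

    /** Plugs in a fetcher for one scheme, e.g. "http" or "ftp". */
    void register(String scheme, Fetcher fetcher) {
        byScheme.put(scheme.toLowerCase(Locale.ROOT), fetcher);
    }

    /** Dispatches to the fetcher registered for the URI's scheme. */
    String fetch(URI uri) {
        String scheme = uri.getScheme();
        Fetcher fetcher = scheme == null ? null : byScheme.get(scheme.toLowerCase(Locale.ROOT));
        if (fetcher == null) {
            throw new IllegalArgumentException("No fetcher registered for: " + uri);
        }
        return fetcher.fetch(uri);
    }

    public static void main(String[] args) {
        FetcherRegistrySketch registry = new FetcherRegistrySketch();
        // Stub fetchers standing in for real HTTP and FTP clients.
        registry.register("http", uri -> "http content of " + uri.getPath());
        registry.register("ftp", uri -> "ftp content of " + uri.getPath());
        System.out.println(registry.fetch(URI.create("ftp://example.org/data.txt")));
    }
}
```
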

| Changes | By |
|---|---|
| Move the example UrlScraper into the main distro so everyone may use it. Fixes SCB-12. | hen |
| Add support for POST requests. Fixes SCB-16. | hen |
| Create .bat files for the example. Fixes SCB-14. | hen |
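
For readers unfamiliar with what a form POST involves, here is a rough sketch of building one. It uses the JDK's own `java.net.http` API (Java 11+) as a stand-in, not the HTTP library Scraping-Engine itself uses; the URL, the field names, and the `FormPostSketch` helper are all made up for illustration, and the request is only built, never sent.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch using java.net.http; not how Scraping-Engine configures POST.
class FormPostSketch {

    /** Encodes fields as application/x-www-form-urlencoded, keeping map order. */
    static String encodeForm(Map<String, String> fields) {
        return fields.entrySet().stream()
            .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
    }

    /** Builds (but does not send) a POST request carrying the encoded form body. */
    static HttpRequest formPost(String url, Map<String, String> fields) {
        return HttpRequest.newBuilder(URI.create(url))
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(encodeForm(fields)))
            .build();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("q", "scraping engine"); // example field, not a real site's form
        fields.put("page", "1");
        HttpRequest request = formPost("http://example.org/search", fields);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(encodeForm(fields));
    }
}
```
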

| Changes | By |
|---|---|
| Infinite loops on africainsight.org. Fixes SCB-2. | hen |
| Needs to respect robots.txt before being unleashed fully. Fixes SCB-3. | hen |
| Add the PrinterStore, a store that prints out. Fixes SCB-5. | hen |
| Add a FileStore concept so that scraped data may be saved. Fixes SCB-7. | hen |
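
The two store entries above suggest a common interface with one implementation that prints and another that saves to disk. The following is a sketch of that split under assumed names (`StoreSketch`, `Store`, `PrintingStore`, `DirectoryStore`, and the `save` signature are all hypothetical); the real PrinterStore and FileStore may be shaped quite differently.

```java
import java.io.IOException;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: interface and class names are assumptions, not Scraping-Engine's API.
class StoreSketch {

    /** Receives one scraped result under a name. */
    interface Store {
        void save(String name, String content);
    }

    /** Prints each result instead of persisting it. */
    static class PrintingStore implements Store {
        private final PrintStream out;

        PrintingStore(PrintStream out) {
            this.out = out;
        }

        @Override
        public void save(String name, String content) {
            out.println(name + ": " + content);
        }
    }

    /** Writes each result to a file of the given name inside a directory. */
    static class DirectoryStore implements Store {
        private final Path dir;

        DirectoryStore(Path dir) {
            this.dir = dir;
        }

        @Override
        public void save(String name, String content) {
            try {
                Files.createDirectories(dir);
                Files.write(dir.resolve(name), content.getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        new PrintingStore(System.out).save("headline", "example text");
        Path dir = Files.createTempDirectory("store-sketch");
        new DirectoryStore(dir).save("headline.txt", "example text");
        System.out.println("wrote " + dir.resolve("headline.txt"));
    }
}
```

Keeping both behind one interface lets a scraper be tried out with the printing variant before its output is pointed at files.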