[ 
https://issues.apache.org/jira/browse/NUTCH-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated NUTCH-2926:
----------------------------------------
    Description: 
The Nutch webserver caches resources (seed lists, configuration, jobs, etc.)
in memory. This is not a reliable or resilient solution for users who want a
persistent Nutch server service for the enterprise.
I therefore propose adding a persistence mechanism to address this problem.
I intend to use jOOQ as a thin layer on top of JDBC. This will provide
flexibility for deploying a wide variety of RDBMS backends.
H2 is a popular, very lightweight (~2.5 MB), appropriately licensed database we
could use as the initial backend. I intend to use it in embedded mode with
persistence enabled so that data is written to disk. This means that if we stop
the Nutch server, we can restart it and restore server resources from disk.
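
To make the embedded-mode point concrete, here is a minimal sketch in plain
JDBC. It is an illustration under stated assumptions, not the proposed
implementation: it assumes the H2 driver is on the classpath, and the class
name NutchServerStore, the "data" directory, the database name "nutch-server"
and the seed_url table are all hypothetical. The only difference between a
volatile and a persistent setup is the JDBC URL: jdbc:h2:mem:<name> keeps
everything in memory, while jdbc:h2:file:<path> writes the database to disk.
jOOQ could later wrap the same connection, for example via
DSL.using(connection, SQLDialect.H2).

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, for illustration only; not part of Nutch.
class NutchServerStore {

    /** In-memory H2 URL: contents are lost when the JVM exits. */
    static String inMemoryUrl(String dbName) {
        return "jdbc:h2:mem:" + dbName;
    }

    /** Embedded, file-backed H2 URL: contents are persisted under dir. */
    static String embeddedFileUrl(Path dir, String dbName) {
        return "jdbc:h2:file:" + dir.toAbsolutePath().normalize().resolve(dbName);
    }

    /** Opens a connection; "sa" with an empty password is H2's usual default. */
    static Connection open(String url) throws SQLException {
        return DriverManager.getConnection(url, "sa", "");
    }

    public static void main(String[] args) {
        String url = embeddedFileUrl(Paths.get("data"), "nutch-server");
        try (Connection conn = open(url)) {
            try (Statement st = conn.createStatement()) {
                // Hypothetical table holding seed URLs per seed list.
                st.execute("CREATE TABLE IF NOT EXISTS seed_url ("
                        + "list_name VARCHAR(255), url VARCHAR(2048))");
            }
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO seed_url (list_name, url) VALUES (?, ?)")) {
                ps.setString(1, "default");
                ps.setString(2, "https://nutch.apache.org/");
                ps.executeUpdate();
            }
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM seed_url")) {
                rs.next();
                // The count grows across restarts because the rows live on disk.
                System.out.println("seed URLs stored: " + rs.getLong(1));
            }
        } catch (SQLException e) {
            // Typically "No suitable driver" when H2 is missing from the classpath.
            System.err.println("Could not use " + url + ": " + e.getMessage());
        }
    }
}
```

Running main twice against the file URL should report a larger row count the
second time, whereas the in-memory URL would start from an empty database on
every run.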

Some resources
https://www.jooq.org/
https://h2database.com/html/download.html
http://www.h2database.com/html/tutorial.html#using_jooq


  was:
The Nutch webserver caches resources (seed lists, configuration, jobs, etc.) 
in-memory. This is not a reliable or resilient solution for the longer term use 
of Nutch server in the enterprise. 
I propose to add a persistence layer which will address this problem.
I intend to use JOOQ as a thin layer on top of JDBC. This will provide 
flexibility for deploying a wide variety of RDBMS backends.
h2 is a popular, very lightweight (~2.5 MB) appropriately-licensed solution we 
could use as the initial backend. I intend to use it in embedded mode with 
enabled persistence so we'll have data on the disk. This means that if we stop 
Nutch server we can restart and restore server resources from disk.

Some resources
https://h2database.com/html/download.html
http://www.h2database.com/html/tutorial.html#using_jooq



> Implement persistent storage for Nutch Webserver resources
> ----------------------------------------------------------
>
>                 Key: NUTCH-2926
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2926
>             Project: Nutch
>          Issue Type: Improvement
>          Components: nutch server, storage
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Major
>
> The Nutch webserver caches resources (seed lists, configuration, jobs, etc.)
> in memory. This is not a reliable or resilient solution for users who want a
> persistent Nutch server service for the enterprise.
> I therefore propose adding a persistence mechanism to address this problem.
> I intend to use jOOQ as a thin layer on top of JDBC. This will provide
> flexibility for deploying a wide variety of RDBMS backends.
> H2 is a popular, very lightweight (~2.5 MB), appropriately licensed database
> we could use as the initial backend. I intend to use it in embedded mode with
> persistence enabled so that data is written to disk. This means that if we
> stop the Nutch server, we can restart it and restore server resources from disk.
> Some resources
> https://www.jooq.org/
> https://h2database.com/html/download.html
> http://www.h2database.com/html/tutorial.html#using_jooq



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
