Derby

2010-06-03 Thread karl.wright
For what it's worth, after some 5 days of work, and a couple of schema changes 
to boot, LCF now runs with Derby.
Some caveats:

(1) You can't run more than one LCF process at a time.  That means you need 
to either run the daemon or the crawler-ui web application, but you can't run 
both at the same time.
(2) I haven't tested every query, so I'm sure there are probably some that 
are still broken.
(3) It's slow.  Count yourself fortunate if it runs at 1/5 the rate of 
PostgreSQL for you.
(4) Transactional integrity hasn't been evaluated.
(5) Deadlock detection and unique-constraint-violation detection are 
probably not right, because I'd need to cause these errors to occur before 
being able to key off their exception messages.
(6) I had to turn off the ability to sort on certain columns in the reports 
- basically, any column that was represented as a large character field.

Nevertheless, this represents an important milestone on the path to being able 
to write some kind of unit tests that have at least some meaning.

If you have an existing LCF Postgresql database, you will need to force an 
upgrade after going to the new trunk code.  To do this, repeat the 
org.apache.lcf.agents.Install command, and the 
org.apache.lcf.agents.Register org.apache.lcf.crawler.system.CrawlerAgent 
command after deploying the new code.  And, please, let me know of any 
errors you notice that could be related to the schema change.
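The two commands above might be invoked roughly as follows. This is a sketch only: the jar names and classpath are assumptions, not part of the LCF distribution as described here; substitute whatever your deployment actually uses.

```shell
# Sketch under assumptions: jar names are illustrative placeholders.
# Re-run the schema installer against the existing database:
java -cp lcf-core.jar:lcf-agents.jar org.apache.lcf.agents.Install

# Then re-register the crawler agent class:
java -cp lcf-core.jar:lcf-agents.jar:lcf-pull-agent.jar \
  org.apache.lcf.agents.Register org.apache.lcf.crawler.system.CrawlerAgent
```

Both steps run against the configured database, so they should be run while no other LCF process is up.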

Thanks,
Karl




RE: Derby

2010-06-03 Thread karl.wright
The daemon does not need to interact with the UI directly, only with the 
database.  So, you stop the UI, start the daemon, and after a while, shut down 
the daemon and restart the UI.

Karl

-Original Message-
From: ext Jack Krupansky [mailto:jack.krupan...@lucidimagination.com] 
Sent: Thursday, June 03, 2010 5:51 PM
To: connectors-dev@incubator.apache.org
Subject: Re: Derby

 (1) You can't run more than one LCF process at a time.  That means you 
 need to either run the daemon or the crawler-ui web application, but you 
 can't run both at the same time.

How do you start a crawl, then, if not in the web app, which then starts the 
agent process crawling?

Thanks for all of this effort!

-- Jack Krupansky


Re: Derby

2010-06-03 Thread Jack Krupansky

Just to be clear, the full sequence would be:

1) Start the UI app. The agent process should not be running.
2) Start the LCF job in the UI.
3) Shut down the UI app - not just close the browser window.
4) AgentRun.
5) Wait long enough for the crawl to have finished. Maybe watch to see that 
Solr has become idle.
6) Possibly commit to Solr.
7) AgentStop.
8) Back to step 1 for additional jobs.

Correct?

-- Jack Krupansky
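The alternation described in this thread could be scripted roughly like this. Everything here is a sketch under assumptions: the `start-ui.sh`/`stop-ui.sh` wrappers are hypothetical stand-ins for however your servlet container is started and stopped, and the sleep duration is arbitrary.

```shell
# Sketch: AgentRun/AgentStop are the agent-process commands from this thread;
# start-ui.sh / stop-ui.sh are hypothetical wrappers for the servlet
# container hosting the crawler-ui webapp.
./stop-ui.sh       # stop the UI so the daemon becomes the sole Derby user
./AgentRun &       # start the agent daemon; it begins processing the job
sleep 3600         # wait long enough for the crawl to finish
./AgentStop        # shut the daemon down cleanly
./start-ui.sh      # bring the UI back to check results or define more jobs
```

The point of the dance is that only one process at a time may have the embedded Derby database open, so the UI and the daemon must take turns.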



Re: Derby

2010-06-03 Thread Jack Krupansky
What is the nature of the single-LCF-process issue? Is it because the 
database is being used in single-user mode, or is it some other issue? Is it 
a permanent limitation, or is a solution or workaround anticipated at some 
stage?

Thanks.

-- Jack Krupansky
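For background on why this limitation arises: Derby's embedded driver boots the database inside a single JVM, and a second JVM that tries to open the same database directory fails with an "already booted" error. A minimal sketch (assumes derby.jar on the classpath; the database name "lcfdb" is illustrative, not the name LCF actually uses):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class DerbyEmbeddedDemo {
    public static void main(String[] args) throws SQLException {
        // Embedded Derby runs inside this JVM; ';create=true' creates the
        // database directory on first use.
        Connection conn =
            DriverManager.getConnection("jdbc:derby:lcfdb;create=true");
        // While this JVM holds the database booted, any OTHER JVM opening
        // "jdbc:derby:lcfdb" fails to boot it -- hence the one-process rule.
        conn.close();
    }
}
```

Derby's network server mode would lift this restriction by putting a server process in front of the database, but nothing in this thread says LCF supports that configuration.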
