Re: Jackrabbit, the database

2007-08-21 Thread Dominique Pfister
Hi Marcel, On 21/08/07, Marcel May <[EMAIL PROTECTED]> wrote: > Jackrabbit must support JTA if it wants to support TXs according to the > JCR Spec > (see previous discussion, > http://www.mail-archive.com/dev@jackrabbit.apache.org/msg06525.html). > At the moment, this is a spec violation IMO: JR s

[jira] Updated: (JCR-1073) Add getTotalSize() to QueryResults

2007-08-21 Thread Christoph Kiehl (JIRA)
[ https://issues.apache.org/jira/browse/JCR-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christoph Kiehl updated JCR-1073: - Attachment: patch.txt A first shot at how this might look. I stumbled over QueryResultImpl which

[jira] Created: (JCR-1073) Add getTotalSize() to QueryResults

2007-08-21 Thread Christoph Kiehl (JIRA)
Add getTotalSize() to QueryResults
Key: JCR-1073
URL: https://issues.apache.org/jira/browse/JCR-1073
Project: Jackrabbit
Issue Type: New Feature
Components: query
Reporter: Christoph K

[jira] Updated: (JCR-1066) Exclude system index for queries that restrict the result set to nodetypes not available in the "jcr:system" subtree

2007-08-21 Thread Christoph Kiehl (JIRA)
[ https://issues.apache.org/jira/browse/JCR-1066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christoph Kiehl updated JCR-1066: - Attachment: patch_with_extendible_system_nodetypes.txt I'm not really happy with all those solution

Re: improving the scalability in searching

2007-08-21 Thread Christoph Kiehl
Ard Schrijvers wrote: Well, if I start implementing the 1:1 mapping simultaneously and also have to make this backwards compatible for old indices, I am afraid that I will have to test all over the place. As far as I can see the following needs to be done: In LuceneQueryBuilder methods need t

Re: improving the scalability in searching

2007-08-21 Thread Christoph Kiehl
Ard Schrijvers wrote: Christoph Kiehl wrote: 4. Regarding sorting: We will still need our own sorting because we cache the document order per subreader, whereas Lucene's sorting only caches per reader, which gets invalidated after every write operation. But the initial cache creation will be faster.

RE: Re: improving the scalability in searching

2007-08-21 Thread Ard Schrijvers
> > Marcel Reutegger wrote: > > Christoph Kiehl wrote: > >> Marcel Reutegger wrote: > >> > >>> 1) New QueryHandler class > >>> 2) Introduce parameter in configuration > >>> 3) Auto-detect in SearchIndex > > [...] > > > ok, I'm at a point where I think we should implement 3). I > don't have >

RE: Re: improving the scalability in searching

2007-08-21 Thread Ard Schrijvers
> Christoph Kiehl wrote: > In general I think it's a good idea to have a 1:1 mapping of > properties to > lucene fields. It's just more natural and easier to > understand as you said. > > Performance wise I'm not sure if it will gain you "lots of > performance". I just > had a quick look at

[jira] Resolved: (JCR-1068) NamespaceRegistryTest.testRegisterNamespace test assumptions

2007-08-21 Thread Julian Reschke (JIRA)
[ https://issues.apache.org/jira/browse/JCR-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julian Reschke resolved JCR-1068. - Resolution: Fixed Fixed with revision 568259. > NamespaceRegistryTest.testRegisterNamespace test

[jira] Commented: (JCR-1068) NamespaceRegistryTest.testRegisterNamespace test assumptions

2007-08-21 Thread Julian Reschke (JIRA)
[ https://issues.apache.org/jira/browse/JCR-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521570 ] Julian Reschke commented on JCR-1068: - Modified proposal: when creating a child node fails, try a property instead.

[jira] Commented: (JCR-701) Upgrade to Xerces 2.8.1

2007-08-21 Thread Bertrand Delacretaz (JIRA)
[ https://issues.apache.org/jira/browse/JCR-701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521568 ] Bertrand Delacretaz commented on JCR-701: - The Xerces dependency might still be needed for JCR-19 > Upgrade to

[jira] Assigned: (JCR-1068) NamespaceRegistryTest.testRegisterNamespace test assumptions

2007-08-21 Thread Julian Reschke (JIRA)
[ https://issues.apache.org/jira/browse/JCR-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julian Reschke reassigned JCR-1068: --- Assignee: Julian Reschke > NamespaceRegistryTest.testRegisterNamespace test assumptions > -

[jira] Commented: (JCR-701) Upgrade to Xerces 2.8.1

2007-08-21 Thread Jukka Zitting (JIRA)
[ https://issues.apache.org/jira/browse/JCR-701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521564 ] Jukka Zitting commented on JCR-701: --- > I'm not really sure why we have a dependency on Xerces at all. The dependency

Re: improving the scalability in searching

2007-08-21 Thread Christoph Kiehl
Marcel Reutegger wrote: Christoph Kiehl wrote: Marcel Reutegger wrote: 1) New QueryHandler class 2) Introduce parameter in configuration 3) Auto-detect in SearchIndex [...] ok, I'm at a point where I think we should implement 3). I don't have fundamental opposition against it and you think

[jira] Commented: (JCR-701) Upgrade to Xerces 2.8.1

2007-08-21 Thread Julian Reschke (JIRA)
[ https://issues.apache.org/jira/browse/JCR-701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521562 ] Julian Reschke commented on JCR-701: Hm. I'm not really sure why we have a dependency on Xerces at all. The JDKs sh

Re: improving the scalability in searching

2007-08-21 Thread Christoph Kiehl
Ard Schrijvers wrote: So, WDYT about indexing properties in separate Lucene Fields, and about possibly indexing more information of one property. My experience with Lucene is that indexing tactically eases querying a lot and gains you lots of performance. So, if you do agree on these changes,

Re: Jackrabbit, the database

2007-08-21 Thread Marcel May
Padraic Hannon wrote: > I concur, relational semantics should be buried within the persistence > managers. However, I think that one can still delegate transaction > handling using JTA to the container rather than using synchronization > and connection.autocommit(false). Jackrabbit must support JT

Re: Jackrabbit, the database

2007-08-21 Thread Padraic Hannon
I concur, relational semantics should be buried within the persistence managers. However, I think that one can still delegate transaction handling using JTA to the container rather than using synchronization and connection.autocommit(false). Obviously this should be configurable. Regardless of

Re: Multiple connections (Was: Jackrabbit, the database)

2007-08-21 Thread Padraic Hannon
WebLogic and Oracle do something similar as well. Also, the PreparedStatement caches in those pools are configurable. Once I have the chance, hopefully this week, I'll get some testing done with the code I modified. Also, Derby has a pooled connection system as well. -pih Thomas Mueller wrote

[jira] Created: (JCR-1072) SPI-commons: QValueTest.testDateValueEquality2 fails due to changes made with JCR-1018

2007-08-21 Thread angela (JIRA)
SPI-commons: QValueTest.testDateValueEquality2 fails due to changes made with JCR-1018
Key: JCR-1072
URL: https://issues.apache.org/jira/browse/JCR-1072
Project:

Re: Multiple connections (Was: Jackrabbit, the database)

2007-08-21 Thread Thomas Mueller
Hi, > returns a prepared statement back to a pool (instead of really closing it) You are right, that's great! Then we should test if the PooledConnectionPersistenceManager really does improve the performance, and in what cases. Thomas
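
The pooling behavior quoted in this message, where close() hands a prepared statement back to a pool instead of really closing it, can be sketched roughly as below. This is only an illustration under stated assumptions: StatementPoolSketch, borrow() and useOnce() are hypothetical names, the statement is a java.lang.reflect.Proxy stub rather than a real JDBC driver object, and nothing here reflects how Jackrabbit's persistence managers or any particular connection pool actually work.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: a "pool" whose statement handles go back to an
// idle queue when close() is called, instead of being destroyed.
final class StatementPoolSketch {
    private final Deque<PreparedStatement> idle = new ArrayDeque<>();
    int created = 0; // number of handles created so far

    PreparedStatement borrow() {
        PreparedStatement pooled = idle.poll();
        if (pooled != null) {
            return pooled; // reuse a handle that was "closed" earlier
        }
        created++;
        InvocationHandler handler = (proxy, method, args) -> {
            if ("close".equals(method.getName())) {
                // Intercept close(): return the handle to the pool.
                idle.push((PreparedStatement) proxy);
                return null;
            }
            // The stub implements nothing else; a real pool would delegate.
            throw new UnsupportedOperationException(method.getName());
        };
        return (PreparedStatement) Proxy.newProxyInstance(
                StatementPoolSketch.class.getClassLoader(),
                new Class<?>[] {PreparedStatement.class},
                handler);
    }

    // Caller side: the usual try/finally pattern, unaware of the pooling.
    static void useOnce(StatementPoolSketch pool) {
        try {
            PreparedStatement ps = pool.borrow();
            try {
                // use the prepared statement
            } finally {
                ps.close();
            }
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        StatementPoolSketch pool = new StatementPoolSketch();
        useOnce(pool);
        useOnce(pool);
        System.out.println("handles created: " + pool.created);
    }
}
```

The calling code stays the same whether or not pooling is in place, which is why the thread suggests measuring the effect rather than assuming it.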

[jira] Created: (JCR-1071) PROPPATCH on collection gets 403 Forbidden

2007-08-21 Thread Rob Owen (JIRA)
PROPPATCH on collection gets 403 Forbidden
Key: JCR-1071
URL: https://issues.apache.org/jira/browse/JCR-1071
Project: Jackrabbit
Issue Type: Bug
Components: webdav
Affects Versions: 1.

Re: Multiple connections (Was: Jackrabbit, the database)

2007-08-21 Thread Jukka Zitting
Hi, On 8/21/07, Thomas Mueller <[EMAIL PROTECTED]> wrote: > > Any reasonable connection pool will pool also the prepared statements, > > I would not be so sure. Maybe if you don't close the prepared > statement (not sure about the disadvantages of that). In any case it > needs to be tested. At le

Re: Multiple connections (Was: Jackrabbit, the database)

2007-08-21 Thread Thomas Mueller
> PreparedStatement ps = connection.prepareStatement(...);
> try {
>     // use the prepared statement
> } finally {
>     ps.close();
> }
> Any reasonable connection pool will pool also the prepared statements,
I would not b

Re: Multiple connections (Was: Jackrabbit, the database)

2007-08-21 Thread Jukka Zitting
Hi, On 8/21/07, Thomas Mueller <[EMAIL PROTECTED]> wrote: > Reusing prepared statements is harder with pooled connections. In JDK > 1.6, thanks to javax.sql.StatementEventListener, they can if the > connection pool manager supports it. In JDK 1.4 and 1.5, I am not sure > what / when connection poo

RE: IndexingConfiguration jr 1.4 release, analyzing, searching and synonymprovider

2007-08-21 Thread Ard Schrijvers
> On 8/21/07, Ard Schrijvers <[EMAIL PROTECTED]> wrote: > > > ... > > value="org.apache.lucene.analysis.fr.FrenchAnalyzer"/> > > value="org.apache.lucene.analysis.de.GermanAnalyzer"/> > > > > > > > > bode_fr > >bode_de > > ... > > I prefer this variant, where

[jira] Commented: (JCR-1070) Promotion of SPI from Contrib

2007-08-21 Thread angela (JIRA)
[ https://issues.apache.org/jira/browse/JCR-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521457 ] angela commented on JCR-1070: - > My only wish about structure is that we get rid of the subproject... sure. that's what i

[jira] Commented: (JCR-1070) Promotion of SPI from Contrib

2007-08-21 Thread Jukka Zitting (JIRA)
[ https://issues.apache.org/jira/browse/JCR-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521453 ] Jukka Zitting commented on JCR-1070: +1 Feel free to go forward! My only wish about structure is that we get rid o

Re: IndexingConfiguration jr 1.4 release, analyzing, searching and synonymprovider

2007-08-21 Thread Bertrand Delacretaz
On 8/21/07, Ard Schrijvers <[EMAIL PROTECTED]> wrote: > ... > value="org.apache.lucene.analysis.fr.FrenchAnalyzer"/> > value="org.apache.lucene.analysis.de.GermanAnalyzer"/> > > > > bode_fr >bode_de > ... I prefer this variant, where you define reusable analyz

RE: IndexingConfiguration jr 1.4 release, analyzing, searching and synonymprovider

2007-08-21 Thread Ard Schrijvers
> Ard Schrijvers wrote: > > > > value="org.apache.lucene.analysis.fr.FrenchAnalyzer"/> > > value="org.apache.lucene.analysis.de.GermanAnalyzer"/> > > > > > > > > bode_fr > >bode_de > > > Marcel Reutegger wrote: > I like this one. Should I wait with a JIRA issue to see

Re: IndexingConfiguration jr 1.4 release, analyzing, searching and synonymprovider

2007-08-21 Thread Marcel Reutegger
Ard Schrijvers wrote: bode_fr bode_de I like this one. regards marcel

RE: Multiple connections (Was: Jackrabbit, the database)

2007-08-21 Thread Michael Roberts
> A single DB connection can only process a single operation at a time. > Jackrabbit locks up completely while storing (example: a larger binary) > - not only for reading but also for writing. > There have been some improvements AFAIK, but it still applies to write > operations. > A simple insert o

RE: IndexingConfiguration jr 1.4 release, analyzing, searching and synonymprovider

2007-08-21 Thread Ard Schrijvers
Sorry, I accidentally sent the mail before it was finished: > Ok. If others agree, I will create a JIRA improvement issue > for it. I should be able to implement it. Once that succeeds, I can add > documentation. The only thing I would need feedback on is how > people would like to see it in the index_configur

RE: IndexingConfiguration jr 1.4 release, analyzing, searching and synonymprovider

2007-08-21 Thread Ard Schrijvers
> Marcel Reutegger wrote: > sounds good. We just have to document the limitations properly. Ok. If others agree, I will create a JIRA improvement issue for it. I should be able to implement it. Once that succeeds, I can add documentation. The only thing I would need feedback on is how people would like to

RE: improving the scalability in searching

2007-08-21 Thread Ard Schrijvers
> Ard Schrijvers wrote: > > So, WDYT about indexing properties in separate Lucene > Fields, and about > > possibly indexing more information of one property. > > Marcel Reutegger wrote: > Because the number of distinct property names in jackrabbit > is unlimited (think > of nt:unstructured node

[jira] Commented: (JCR-1066) Exclude system index for queries that restrict the result set to nodetypes not available in the "jcr:system" subtree

2007-08-21 Thread Marcel Reutegger (JIRA)
[ https://issues.apache.org/jira/browse/JCR-1066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521417 ] Marcel Reutegger commented on JCR-1066: --- There's another potential issue. I know of at least one implementation t

Re: IndexingConfiguration jr 1.4 release, analyzing, searching and synonymprovider

2007-08-21 Thread Marcel Reutegger
Ard Schrijvers wrote: Now, Marcel correctly pointed out the problem when referring to the node scope search: //*[jcr:contains(., 'hägar')] Having given it another thought, it does make sense to me that when searching in the node scope, you use the global/default analyzer, and when you are

Re: improving the scalability in searching

2007-08-21 Thread Marcel Reutegger
Ard Schrijvers wrote: So, WDYT about indexing properties in separate Lucene Fields, and about possibly indexing more information of one property. Because the number of distinct property names in jackrabbit is unlimited (think of nt:unstructured nodes), this would lead to a great number of file

[jira] Commented: (JCR-1070) Promotion of SPI from Contrib

2007-08-21 Thread Julian Reschke (JIRA)
[ https://issues.apache.org/jira/browse/JCR-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521409 ] Julian Reschke commented on JCR-1070: - I'm in favor of this. If there's anything specific I can do to help, please

Re: improving the scalability in searching

2007-08-21 Thread Marcel Reutegger
Christoph Kiehl wrote: Marcel Reutegger wrote: 1) New QueryHandler class 2) Introduce parameter in configuration 3) Auto-detect in SearchIndex I prefer 1) because it makes it explicit. I have reservations regarding 3) because it introduces some magic. I don't like 2) because we probably cann

Re: Multiple connections (Was: Jackrabbit, the database)

2007-08-21 Thread Thomas Mueller
Hi, > a larger binary > A simple insert of 10MB will lock up JR for a few seconds, this is quite a > problem IMO. Sounds like multiple connections would help here... I didn't really think about large binaries so far, because I thought they are usually processed by the blob store (or global data sto

Re: Total size of a query result and setLimit()

2007-08-21 Thread Marcel Reutegger
Ard Schrijvers wrote: Sorry for interfering with this performance thing, but it is a very similar problem to the one we had with the Lucene impl we built on top of Slide some time ago (the index was not ACL aware, so how to know the actual number of hits which are visible for a user without having to test all nodes),

Re: Multiple connections (Was: Jackrabbit, the database)

2007-08-21 Thread Marcel May
Thomas Mueller wrote: >> avoid the synchronization on the PreparedStatements >> > > I don't think that synchronization on prepared statements is a > bottleneck. But you can prove that I am wrong. If writing the > changelog is synchronized (not sure if it is), that would be a > bottleneck. > >

[jira] Created: (JCR-1070) Promotion of SPI from Contrib

2007-08-21 Thread angela (JIRA)
Promotion of SPI from Contrib
Key: JCR-1070
URL: https://issues.apache.org/jira/browse/JCR-1070
Project: Jackrabbit
Issue Type: Task
Reporter: angela
Priority: Minor

This has been sugge

Re: Multiple connections (Was: Jackrabbit, the database)

2007-08-21 Thread Thomas Mueller
> avoid the synchronization on the PreparedStatements I don't think that synchronization on prepared statements is a bottleneck. But you can prove that I am wrong. If writing the changelog is synchronized (not sure if it is), that would be a bottleneck. Thomas

Re: Jackrabbit, the database

2007-08-21 Thread David Nuescheler
Hi Michi, > well, I want my local filesystem accessible via JCR and allowing me to > change files still with "vi(m)" from time to time. > Or do you consider that covered by A, B or C? But I guess my usecase is > probably negligible as a usecase ... I usually use a webdav or an smb mount to be abl

Re: Jackrabbit, the database

2007-08-21 Thread Bertrand Delacretaz
On 8/21/07, Michael Wechner <[EMAIL PROTECTED]> wrote: > well, I want my local filesystem accessible via JCR and allowing me to > change files still with "vi(m)" from time to time This works, whatever storage mechanism is used, but in the opposite way: you can mount a Jackrabbit repositor

Re: Jackrabbit, the database

2007-08-21 Thread Michael Wechner
David Nuescheler wrote: hi all, i can appreciate both positions, looking at jackrabbit as the datastore or looking at jackrabbit as running on top of a datastore (rdbms). personally, i don't believe that the latter perception will go away for quite a while, so i think jackrabbit should suppor

Multiple connections (Was: Jackrabbit, the database)

2007-08-21 Thread Jukka Zitting
Hi, [Branching the thread] On 8/21/07, Thomas Mueller <[EMAIL PROTECTED]> wrote: > My question is: should we support multiple connections in the > persistence manager? > > If using multiple connections really does improve performance / > scalability of Jackrabbit, I think we should think about it

Re: Jackrabbit, the database

2007-08-21 Thread Thomas Mueller
Hi, For me, the question is not really databases or not (databases need to be supported, while native storage can be much faster). My question is: should we support multiple connections in the persistence manager? If using multiple connections really does improve performance / scalability of Jac

Re: Jackrabbit, the database

2007-08-21 Thread Jukka Zitting
Hi, On 8/20/07, Thomas Mueller <[EMAIL PROTECTED]> wrote: > > management won't. > > political reasons. > > won't move to Jackrabbit *if* Jackrabbit cannot store it in oracle. > > I agree. My guess is about 50% of larger organizations want a > database as the backend, even if databases are slower.

[jira] Commented: (JCR-996) Name and Path interfaces in SPI

2007-08-21 Thread angela (JIRA)
[ https://issues.apache.org/jira/browse/JCR-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521347 ] angela commented on JCR-996: Follow up discussion 'Distribution of commons classes': http://www.mail-archive.com/dev@jackra

Re: Jackrabbit, the database

2007-08-21 Thread David Nuescheler
hi all, i can appreciate both positions, looking at jackrabbit as the datastore or looking at jackrabbit as running on top of a datastore (rdbms). personally, i don't believe that the latter perception will go away for quite a while, so i think jackrabbit should support both views. in my experi