Re: Workspace.copy() Question ...

2008-11-19 Thread Thomas Müller
Hi, > extended the Value interface instead of InputStream. That would work as well: JackrabbitValue. > if we make it like you wrote ... > every module must handle this internally. Yes. The modules can use standard JCR event listeners, so they are backward compatible and implementation independe

AW: Workspace.copy() Question ...

2008-11-18 Thread KÖLL Claus
some more thoughts ... thomas, if we make it like you wrote ... class VirusScanner { public void scan(InputStream in) throws VirusFoundException { if(in instanceof DataStoreInputStream) { DataIdentifier di = ((DataStoreInputStream) in).getDataIdentifier(); if (

AW: Workspace.copy() Question ...

2008-11-18 Thread KÖLL Claus
hi guys, for my understanding ... >Probably not in the Lucene index files itself. Text extraction could be used >without using the Lucene index, for example to display the text content of a >PDF file. The text extraction module could store the DataIdentifier together >with the extracted text

Re: Workspace.copy() Question ...

2008-11-17 Thread Jukka Zitting
Hi, On Mon, Nov 17, 2008 at 11:07 AM, Thomas Müller <[EMAIL PROTECTED]> wrote: > Currently we don't detect that the binary already exists when using > the regular JCR API. We do for things like workspace.copy(...) or propertyA.setValue(propertyB.getValue()). The only case where we don't do that i

Re: Workspace.copy() Question ...

2008-11-17 Thread Thomas Müller
Hi, >> Exactly. The DataStore should also check if the InputStream is a >> DataStoreInputStream, so maybe it doesn't need to copy the binary: > > IMHO we should (and currently do) handle that on a higher level, by > tracking the DataIdentifier in InternalValue. Currently we don't detect that the

Re: Workspace.copy() Question ...

2008-11-17 Thread Jukka Zitting
Hi, On Mon, Nov 17, 2008 at 10:02 AM, Thomas Müller <[EMAIL PROTECTED]> wrote: >> But what will you do in the case if you try to copy >> a node internally .. the datastore should know that he must not read the >> binary >> to prevent extra read and write to the datastore. > > Exactly. The DataStor

Re: Workspace.copy() Question ...

2008-11-17 Thread Thomas Müller
Hi, > would you store the dataidentifier in the index > and so in all modules ? Probably not in the Lucene index files itself. Text extraction could be used without using the Lucene index, for example to display the text content of a PDF file. The text extraction module could store the DataIdenti

AW: Workspace.copy() Question ...

2008-11-13 Thread KÖLL Claus
Hi Thomas, >Instead of returning an InputStream, Jackrabbit would return a >DataStoreInputStream with the additional method getDataIdentifier(). >Then the module can read the identifier, check if the item is already >processed, and avoid reading the data itself if this identifier is >already proce
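
The proposal quoted in this message (a stream that carries the identity of its binary, so a module can skip data it has already handled) could look roughly like the sketch below. It is only an illustration under stated assumptions: the DataStoreInputStream class here is a stand-in written for this example, not the Jackrabbit class, the identifier is a plain String rather than a DataIdentifier, and ProcessOnceModule with its counter is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.HashSet;
import java.util.Set;

// Stand-in for the proposed stream type: an InputStream that also exposes
// the identity of the binary it reads. Assumption: a String identifier is
// used here instead of Jackrabbit's DataIdentifier to stay self-contained.
class DataStoreInputStream extends FilterInputStream {
    private final String dataIdentifier;

    DataStoreInputStream(InputStream in, String dataIdentifier) {
        super(in);
        this.dataIdentifier = dataIdentifier;
    }

    String getDataIdentifier() {
        return dataIdentifier;
    }
}

// Hypothetical module (text extraction, virus scan, thumbnailing, ...)
// that runs its expensive work at most once per identifier.
class ProcessOnceModule {
    private final Set<String> processed = new HashSet<>();
    private int expensiveRuns = 0;

    void process(InputStream in) {
        if (in instanceof DataStoreInputStream) {
            String id = ((DataStoreInputStream) in).getDataIdentifier();
            // Set.add returns false if the identifier was seen before.
            if (!processed.add(id)) {
                return; // already processed: do not read the data again
            }
        }
        try {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) != -1) {
                // placeholder for the expensive work on the bytes
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        expensiveRuns++;
    }

    int getExpensiveRuns() {
        return expensiveRuns;
    }
}

class ProcessOnceDemo {
    public static void main(String[] args) {
        ProcessOnceModule module = new ProcessOnceModule();
        byte[] pdf = {1, 2, 3};
        // 70 copies of the same binary share one identifier.
        for (int i = 0; i < 70; i++) {
            module.process(new DataStoreInputStream(new ByteArrayInputStream(pdf), "id-1"));
        }
        System.out.println("expensive runs: " + module.getExpensiveRuns());
    }
}
```

A plain InputStream without an identifier is always processed in this sketch, which keeps the module usable with streams that do not come from a data store.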

Re: Workspace.copy() Question ...

2008-11-12 Thread Thomas Müller
Hi, The problem is: "process the binary only once". With 'process' we said 'text extraction', but it could be 'virus scan', 'index', 'create a thumbnail', 'transfer' (to the client or from the client), or 'backup' - any expensive task. I believe a good solution is to provide the object identity t

Re: Workspace.copy() Question ...

2008-11-12 Thread Jukka Zitting
Hi, On Tue, Nov 11, 2008 at 10:06 AM, Thomas Müller <[EMAIL PROTECTED]> wrote: > It's an interesting use case, and probably quite common. It would be > good if the text extraction would be run only once for each binary. > However I'm not sure how this should be implemented... One solution is > to

AW: Workspace.copy() Question ...

2008-11-11 Thread KÖLL Claus
ue to write down the problems .. greets claus -Original Message- From: Thomas Müller [mailto:[EMAIL PROTECTED] Sent: Tuesday, 11 November 2008 10:07 To: dev@jackrabbit.apache.org Subject: Re: Workspace.copy() Question ... Hi, > i have a nice usecase .. i have a fileno

Re: Workspace.copy() Question ...

2008-11-11 Thread Thomas Müller
Hi, > i have a nice usecase .. i have a filenode in my workspace and i should create > about 70 copies of this node. > its a not so small pdf file (10Mb) and i am using the datastore so its no > problem > the binary exists only one time but the problem is the textextractor. it will > be called 7

Workspace.copy() Question ...

2008-11-11 Thread KÖLL Claus
hi there ... i have a nice usecase .. i have a filenode in my workspace and i should create about 70 copies of this node. its a not so small pdf file (10Mb) and i am using the datastore so its no problem the binary exists only one time but the problem is the textextractor. it will be call