Hi.

>> * doesn't synchronizing the addRecord() method and using only one
>> connection defeat one of the purposes of the data store, namely
>> allowing maximum concurrency?
>
> Yes that's true. Using the data store itself improves concurrency as
> simple (non-blob) repository operations are not blocked by operations
> that involve blobs. Using multiple connections could improve
> concurrency and could even speed up the process (if the database
> writes to multiple hard drives). So far I have not thought about that.

While it's true that just having the data store offloads a lot from
Jackrabbit, I think it's not enough.

> The question is: how important is this feature?

Well, for me it's crucial. As you can see, the code I posted uses multiple
connections, possibly from a connection pool.
Even if this feature is not as important as I think it is, we should at
least not prevent an easy extension that adds it. Currently, having all
methods and fields private means that in order to extend the DB data store
you have to reimplement every method.

>> * making the SQL strings private and not initializing them in a method of
>> its own really complicates extending the implementation
>
> Sorry I have committed the properties files to the wrong folder first!
> I have fixed it now. The SQL statements can be overloaded in the
> <databaseType>.properties file in
> src/main/resources/org/apache/jackrabbit/core/data/db. Currently they
> are not overloaded, but maybe they need to be. I have only tested
> derby and H2 so far. initDatabaseType() loads the properties file.

I tested with SQL Server 2005 using the default statements, and they don't
work because the BLOB data type is called IMAGE there. After I changed
that, everything else worked fine.

>> (in any case, the SQL strings should be written as "UPDATE " + tableSQL
>> + "DATASTORE SET DATA=? WHERE ID=?")
>
> Both the table name and the SQL strings can be overloaded (in the
> properties file), so building the SQL statements is not required in my
> view.

No, it's not required, but if all you want to do is change the table name,
I think having to rewrite all the statements is too much. The same goes for
something as simple as changing BLOB to IMAGE in only one statement.
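
To illustrate what I mean, here is a minimal sketch of statements built from
overridable pieces. The class names, method names and the column layout are
made up for this example and are not the actual DbDataStore code; only the
UPDATE statement follows the form I suggested earlier.

```java
// Hypothetical sketch: names and column layout are illustrative only,
// not Jackrabbit's actual DbDataStore implementation.
class DataStoreSql {
    private final String tablePrefix;

    DataStoreSql(String tablePrefix) {
        this.tablePrefix = tablePrefix;
    }

    /** Full table name; the prefix is the only part most users change. */
    protected String tableName() {
        return tablePrefix + "DATASTORE";
    }

    /** Column type for binary content; override per database. */
    protected String blobType() {
        return "BLOB";
    }

    /** The column layout here is assumed for the example. */
    protected String createTableSql() {
        return "CREATE TABLE " + tableName()
            + " (ID VARCHAR(255) PRIMARY KEY, DATA " + blobType() + ")";
    }

    protected String updateDataSql() {
        return "UPDATE " + tableName() + " SET DATA=? WHERE ID=?";
    }
}

/** SQL Server 2005 variant: only the binary column type differs. */
class SqlServerDataStoreSql extends DataStoreSql {
    SqlServerDataStoreSql(String tablePrefix) {
        super(tablePrefix);
    }

    @Override
    protected String blobType() {
        return "IMAGE";
    }
}

class SqlStatementsDemo {
    public static void main(String[] args) {
        DataStoreSql sql = new SqlServerDataStoreSql("JR_");
        System.out.println(sql.createTableSql());
        System.out.println(sql.updateDataSql());
    }
}
```

With something along these lines, changing the table prefix or the blob type
is a one-line override, and the properties file could still replace a whole
statement when a database needs more than that.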

>> * during a Session.save() there are various calls to DbDataStore.getRecord()
>> and DbDataRecord.getStream(), for storing the blob in the blobStore. Why is
>> this necessary if the binary content is already in the data store? It seems
>> that this copy is overwritten every time, but I don't see the reason for
>> all these calls to the DB, and file copies.
>
> That's not good. I like to solve this problem. Does this occur when
> simply storing a node with a large object? If not, do you have a
> simple test case?

Just doing the following in a class that extends AbstractJCRTest is enough:

Session session = helper.getSuperuserSession();
Node root = session.getRootNode();
root.setProperty("notice", new FileInputStream("NOTICE.txt"));
session.save();

The save() causes two getRecord() calls and one getStream() call, and the
stream is copied to a temp file on the filesystem.

Another thing I noticed is that the code calls usesIdentifier() at the end
of addRecord(). Wouldn't it be better to call it as soon as the definitive
identifier is available, in case the GC happens to run in between?
I'll keep playing with it and report if I find anything.
Regards,

Esteban Franqueiro
[EMAIL PROTECTED] 

