Re: Faster reference binary handling
It's quite likely the blob is being downloaded from S3, as Thomas mentioned. The fixes done so far only affect the DSGC (datastore garbage collection) code paths. There is an S3DataStore property, proactiveCaching, which is true by default and which you can turn off.

Thanks
Amit

On Fri, Sep 16, 2016 at 12:13 PM, Chetan Mehrotra wrote:
> I think fixes have recently been done in this area. However, it
> would be good to have an integration test for the reference check
> scenario to ensure that it does not download the blobs unnecessarily.
>
> Chetan Mehrotra
Re: Faster reference binary handling
I think fixes have recently been done in this area. However, it would be good to have an integration test for the reference check scenario to ensure that it does not download the blobs unnecessarily.

Chetan Mehrotra

On Fri, Sep 16, 2016 at 11:56 AM, Thomas Mueller wrote:
> Hi,
>
> Possibly the binary is downloaded from S3 in this case. We have seen
> similar performance issues with datastore GC when using the S3 datastore.
>
> It should be possible to verify this with full thread dumps. Plus we would
> see where exactly the download occurs. Maybe it is checking the length or
> so.
>
>> this API requires Oak to always retrieve the binary value from the DS
>
> I think the problem is in the S3 datastore implementation, and not the
> API. But let's see.
>
> Regards,
> Thomas
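
The integration test suggested above could be shaped roughly as follows. This is only a minimal sketch: BlobBackend, CountingBackend and ReferenceBinder are hypothetical stand-ins invented for illustration, not Oak or Jackrabbit classes. A real test would presumably wrap the actual S3 backend with a similar download counter and run the JCR-3534 code path against it.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a remote blob store; not an Oak interface. */
interface BlobBackend {
    boolean exists(String reference);
    InputStream download(String reference);
}

/** In-memory backend that counts how often blob content is fetched. */
final class CountingBackend implements BlobBackend {
    private final Map<String, byte[]> blobs = new HashMap<>();
    final AtomicInteger downloads = new AtomicInteger();

    void put(String reference, byte[] content) {
        blobs.put(reference, content);
    }

    @Override
    public boolean exists(String reference) {
        return blobs.containsKey(reference);
    }

    @Override
    public InputStream download(String reference) {
        byte[] content = blobs.get(reference);
        if (content == null) {
            throw new IllegalArgumentException("unknown reference: " + reference);
        }
        downloads.incrementAndGet();
        return new ByteArrayInputStream(content);
    }
}

/** Hypothetical code under test: binds a property to a blob by reference. */
final class ReferenceBinder {
    private final BlobBackend backend;
    final Map<String, String> properties = new HashMap<>();

    ReferenceBinder(BlobBackend backend) {
        this.backend = backend;
    }

    /** Stores the reference if the blob exists; never reads its content. */
    boolean bind(String property, String reference) {
        if (!backend.exists(reference)) {
            return false;
        }
        properties.put(property, reference);
        return true;
    }
}

final class ReferenceCheckSketch {
    public static void main(String[] args) {
        CountingBackend backend = new CountingBackend();
        backend.put("ref-1", new byte[] {1, 2, 3});
        ReferenceBinder binder = new ReferenceBinder(backend);

        boolean bound = binder.bind("foo", "ref-1");
        // The check the thread asks for: binding must not trigger a download.
        if (!bound || backend.downloads.get() != 0) {
            throw new AssertionError("reference check downloaded the blob");
        }
        System.out.println("bound without download");
    }
}
```

The point of the counter is that the assertion fails as soon as any code path reads the blob content during a reference check, regardless of where in the stack the read happens.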
Re: Faster reference binary handling
Hi,

Possibly the binary is downloaded from S3 in this case. We have seen similar performance issues with datastore GC when using the S3 datastore.

It should be possible to verify this with full thread dumps. Plus, we would see where exactly the download occurs. Maybe it is checking the length or so.

> this API requires Oak to always retrieve the binary value from the DS

I think the problem is in the S3 datastore implementation, and not the API. But let's see.

Regards,
Thomas

On 15/09/16 18:04, "Tommaso Teofili" wrote:
> Hi all,
>
> while working with Oak S3 DS I have witnessed slowness (no numbers, just
> 'slow' from a user perspective) in persisting a binary using its
> reference; although this may be related to some environment specific
> issue I wondered about the reference binary handling we introduced in
> JCR-3534 [1].
> In fact the implementation there requires to do something like
>
>     ReferenceBinary ref = new SimpleReferenceBinary(referenceString);
>     Binary referencedBinary = session.getValueFactory().createValue(ref).getBinary();
>     node.setProperty("foo", referencedBinary);
>
> on the "installation" side.
> Despite all possible issues in the implementation it seems this API
> requires Oak to always retrieve the binary value from the DS and then
> store its value into the node whereas it'd be much better to avoid
> having to read the value but instead bind it to that referenced binary.
>
>     ReferenceBinary ref = new SimpleReferenceBinary(referenceString);
>     if (ref.isValid()) { // referenced binary exists in the DS
>         node.setProperty("foo", ref, Type.BINARY); // set a string with binary type !?
>     }
>
> I am not sure if the above code could make sense, probably not, but at
> least wanted to point out the problem as to seek for possible
> enhancements.
>
> Regards,
> Tommaso
>
> [1] : https://issues.apache.org/jira/browse/JCR-3534
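
The thread-dump check suggested in this message could look something like the sketch below. It is an in-process illustration only: the class name ThreadDumpFilter and the default marker "S3" are assumptions made for the example, and against a running Oak instance the same information would more commonly be captured externally, for example with jstack. The sketch lists every live thread whose stack contains a frame from a class matching the marker, which is what would show where a download is triggered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ThreadDumpFilter {

    /**
     * Returns one formatted stack per live thread that has at least one
     * frame whose class name contains the given marker.
     */
    static List<String> threadsTouching(String marker) {
        List<String> hits = new ArrayList<>();
        Map<Thread, StackTraceElement[]> dump = Thread.getAllStackTraces();
        for (Map.Entry<Thread, StackTraceElement[]> entry : dump.entrySet()) {
            StackTraceElement[] stack = entry.getValue();
            for (StackTraceElement frame : stack) {
                if (frame.getClassName().contains(marker)) {
                    hits.add(format(entry.getKey(), stack));
                    break; // one entry per thread is enough
                }
            }
        }
        return hits;
    }

    /** Formats a thread name, its state and its frames, one frame per line. */
    static String format(Thread thread, StackTraceElement[] stack) {
        StringBuilder sb = new StringBuilder();
        sb.append('"').append(thread.getName()).append("\" ")
          .append(thread.getState()).append('\n');
        for (StackTraceElement frame : stack) {
            sb.append("    at ").append(frame).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "S3" is an assumed marker; pass a package or class name instead.
        String marker = args.length > 0 ? args[0] : "S3";
        threadsTouching(marker).forEach(System.out::println);
    }
}
```

Taking several dumps a few seconds apart while the slow save is in progress makes it easier to tell a thread that is genuinely stuck in a download from one that merely passed through the datastore code.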
Faster reference binary handling
Hi all,

while working with Oak S3 DS I have witnessed slowness (no numbers, just 'slow' from a user perspective) in persisting a binary using its reference; although this may be related to some environment-specific issue, I wondered about the reference binary handling we introduced in JCR-3534 [1].
In fact, the implementation there requires doing something like

    ReferenceBinary ref = new SimpleReferenceBinary(referenceString);
    Binary referencedBinary = session.getValueFactory().createValue(ref).getBinary();
    node.setProperty("foo", referencedBinary);

on the "installation" side.
Despite all possible issues in the implementation, it seems this API requires Oak to always retrieve the binary value from the DS and then store its value into the node, whereas it'd be much better to avoid having to read the value and instead bind the property to that referenced binary.

    ReferenceBinary ref = new SimpleReferenceBinary(referenceString);
    if (ref.isValid()) { // referenced binary exists in the DS
        node.setProperty("foo", ref, Type.BINARY); // set a string with binary type !?
    }

I am not sure if the above code makes sense, probably not, but I at least wanted to point out the problem in order to look for possible enhancements.

Regards,
Tommaso

[1] : https://issues.apache.org/jira/browse/JCR-3534