So there are two problems:

1) 409 when deleting objects.
2) Transactions taking longer after 24-48 hours.

For (1), it looks like the request reached the Swift cluster but the
Swift cluster itself wasn't able to fulfill it. This could be because
of the "eventual consistency" semantics of blobstores. When the delete
request reached Swift, it could have been in the middle of some
operation on the object itself (e.g. reading the object to replicate
it, audit it, etc). Jclouds did its job of actually sending the
request, so I'm not sure what else can be done here. Maybe we could
add retries if the blobstore returns 409. But the main problem lies on
the Swift side. The OpenStack mailing list would be a better place to
ask this question. There are many more Swift experts there.
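
If you want to try retrying on 409 in your test code, here is a rough
sketch of a retry wrapper with exponential backoff. The names
RetryOnConflict and looksLikeConflict are made up for illustration and
are not part of jclouds. I also haven't checked exactly which
exception jclouds throws for a 409 on delete, so the check below just
looks for "409" in the exception message; that is an assumption you
would want to replace with a proper status code check.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

// Hypothetical helper, not part of jclouds.
final class RetryOnConflict {
    private RetryOnConflict() {}

    /**
     * Runs op and retries while isRetryable accepts the failure, up to
     * maxAttempts attempts in total. The sleep between attempts starts at
     * baseDelayMillis and doubles after each failed attempt.
     */
    static <T> T call(Callable<T> op, Predicate<Exception> isRetryable,
                      int maxAttempts, long baseDelayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = baseDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                // Give up on the last attempt or on a non-retryable failure.
                if (attempt >= maxAttempts || !isRetryable.test(e)) {
                    throw e;
                }
                Thread.sleep(delay);
                delay *= 2;
            }
        }
    }

    /** Assumption: a conflict shows up as "409" somewhere in the message. */
    static boolean looksLikeConflict(Exception e) {
        String msg = e.getMessage();
        return msg != null && msg.contains("409");
    }
}
```

In your loop the delete would then become something like
RetryOnConflict.call(() -> { blobStore.removeBlob(container, key);
return null; }, RetryOnConflict::looksLikeConflict, 5, 50). This only
hides the symptom on the client side; it doesn't explain why Swift
returns the 409.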

For (2), from the curl example code, it looks like you're creating
multiple processes, each doing a put or a delete (no get). This is
different from jclouds spawning multiple threads. It would be great if
both experiments counted the transactions they complete, so we can
check whether they reach the same number in a given amount of time.
If the two workloads are otherwise equivalent and jclouds still
completes fewer txns than the shell script, we can conclude that
jclouds is the cause of the problem.
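
One way to do the counting on the Java side is sketched below. The
class TxnStats is hypothetical (it is not a jclouds class); it just
wraps each blobstore call, counts successes and failures, and tracks
the total time spent, using atomics so that all 10 threads can share
one instance.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper for the test program, not part of jclouds.
final class TxnStats {
    private final AtomicLong succeeded = new AtomicLong();
    private final AtomicLong failed = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();

    /** Runs op, records whether it succeeded and how long it took. */
    void time(Runnable op) {
        long start = System.nanoTime();
        try {
            op.run();
            succeeded.incrementAndGet();
        } catch (RuntimeException e) {
            failed.incrementAndGet();
            throw e; // the caller still sees the failure
        } finally {
            totalNanos.addAndGet(System.nanoTime() - start);
        }
    }

    long succeeded() { return succeeded.get(); }

    long failed() { return failed.get(); }

    /** Mean latency in milliseconds over all recorded calls, 0 if none. */
    double meanMillis() {
        long n = succeeded.get() + failed.get();
        return n == 0 ? 0.0 : (totalNanos.get() / 1e6) / n;
    }
}
```

You could keep one TxnStats each for put, get and delete, call e.g.
putStats.time(() -> blobStore.putBlob(container, blob)), and print the
counters every hour. Comparing those numbers with the number of curl
invocations the shell script completes in the same hour would tell us
whether the two tests are really doing the same amount of work.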

Now, answering some of the questions below.

> It would be great if someone let me know how jcloud delete works. Is there
> any internal queue while put or delete ? I saw if I put a small sleep of
> 300ms between put n del call, it works fine.

I presume the blobstore object you're using in Example9.blobStore is
of type "BlobStore" and not "AsyncBlobStore". AsyncBlobStore is
deprecated. The BlobStore object is synchronous. There is no queue.
When you call removeBlob, the request gets created and sent to the
Swift cluster.

> Also I assume that jclouds calls are synchronous one n put could not come
> out till object get saved in swift.

For the BlobStore type, yes, it is sync.

Some JVM-level settings might also be at play here, in particular the
amount of memory you're allocating to the heap. You can change the
initial and maximum heap size given to the JVM with the -Xms and -Xmx
options.
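
To see whether the heap is actually filling up over the 24-48 hours,
you could log it periodically from inside the test. This is only a
sketch; HeapSnapshot is a made-up name, and the numbers come from the
standard java.lang.Runtime methods.

```java
// Hypothetical helper for the test program.
final class HeapSnapshot {
    private HeapSnapshot() {}

    /** Returns a one-line summary of current heap usage in megabytes. */
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        long max = rt.maxMemory();         // upper limit, governed by -Xmx
        long committed = rt.totalMemory(); // heap currently reserved by the JVM
        long used = committed - rt.freeMemory();
        return String.format("heap used=%dMB committed=%dMB max=%dMB",
                used / mb, committed / mb, max / mb);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If "used" keeps creeping towards "max" as the TPS drops, that would
point at memory pressure in the client rather than at Swift.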

-Shri

>  On Apr 22, 2014 11:59 AM, "Sumit Gaur" <[email protected]> wrote:
>
>> Hi
>> Please find my answer below
>>
>> On Apr 22, 2014 10:49 AM, "Jasdeep Hundal" <[email protected]>
>> wrote:
>> >
>> > Hey Sumit,
>> >
>> > I have a couple more questions that might help clarify the situation:
>> >
>> > 1. Are you running the stability test as a single long running Java
>> process
>> > (that just keeps cycling through the 10 uploads/gets/deletes)?
>> >
>>
>> Yes. But this process has threads.
>>
>> > 2. Are you always running the test in the same container, or are you
>> > creating new containers for each test iteration?
>> >
>> No, I am doing roundrobin in 1000 containers
>>
>> > 3. If the answer to #2 is is that the test runs in a single container,
>> how
>> > many objects does that container currently have?
>> >
>>
>> 0 in the ideal case. But as I am also facing 409 delete failures, there
>> are some objects in each container, in the hundreds only.
>>
>> > It may also help to time each of the individual blobstore actions as you
>> > run the test to see if any particular one is slowing down.
>> >
>>
>> Even individual put and del times increase over time.
>>
>> > Jasdeep
>> >
>> >
>> > On Mon, Apr 21, 2014 at 6:21 PM, Sumit Gaur <[email protected]>
>> wrote:
>> >
>> > > hi Shri,
>> > > Please find answers below
>> > >
>> > > On Tue, Apr 22, 2014 at 9:23 AM, Shrinand Javadekar <
>> > > [email protected]
>> > > > wrote:
>> > > Few more questions to try and understand this better:
>> > >
>> > > 1) On the Swift instance you are using, how many replicas do you have?
>> > >
>> > > 3 replica
>> > >
>> > > 2) Also, how are you using the curl command in the shell script?
>> > >
>> > > Send the commands below in the background for 10 iterations and wait,
>> > > similar to the 10 threads in jclouds.
>> > >
>> > >             curl -X PUT -i -T 100k -H "X-Auth-Token: $OS_AUTH_TOKEN" \
>> > >                 http://$PROXY_LOCAL_NET_IP:80/v1/AUTH_${KEYSTONE_ID}/zest1-${cn}/zest1-${k}-${i}-${j}.txt
>> > >             curl -X DELETE -i -H "X-Auth-Token: $OS_AUTH_TOKEN" \
>> > >                 http://$PROXY_LOCAL_NET_IP:80/v1/AUTH_${KEYSTONE_ID}/zest1-${cn}/zest1-${k}-${i}-${j}.txt
>> > >
>> > > I
>> > > think the shell script and jclouds-with-10-parallel-threads may not be
>> > > doing the same amount of work. In 20 hours jclouds might be doing much
>> > > more work than the shell script. If you let the shell script also go
>> > > upto that point, it might see failures too. Do you know how many
>> > > PUT-GET-DEL operations have been performed when you start seeing the
>> > > 409 errors.
>> > >
>> > > Actually 409 errors are coming since the start of the test but TPS
>> start
>> > > degrading after 24-48 hours.
>> > > On Apr 22, 2014 9:23 AM, "Shrinand Javadekar" <[email protected]
>> >
>> > > wrote:
>> > >
>> > > > Few more questions to try and understand this better:
>> > > >
>> > > > 1) On the Swift instance you are using, how many replicas do you
>> have?
>> > > > 2) Also, how are you using the curl command in the shell script? I
>> > > > think the shell script and jclouds-with-10-parallel-threads may not
>> be
>> > > > doing the same amount of work. In 20 hours jclouds might be doing
>> much
>> > > > more work than the shell script. If you let the shell script also go
>> > > > upto that point, it might see failures too. Do you know how many
>> > > > PUT-GET-DEL operations have been performed when you start seeing the
>> > > > 409 errors.
>> > > >
>> > > > -Shri
>> > > >
>> > > >
>> > > > On Mon, Apr 21, 2014 at 4:55 PM, Sumit Gaur <[email protected]>
>> > > wrote:
>> > > > > FYI ..This is block of code .....   also I am using jclouds 1.7.1
>> > > (Stable
>> > > > > branch)
>> > > > >      // key is declared outside the try so the catch block can log it
>> > > > >      String key = "objkey" + UUID.randomUUID();
>> > > > >      try {
>> > > > >          Blob blob = Example9.blobStore.blobBuilder(key).payload(Example9.file).build();
>> > > > >          Example9.blobStore.putBlob(Example9.containerName + count, blob);
>> > > > >          Example9.blobStore.getBlob(Example9.containerName + count, key);
>> > > > >          Example9.blobStore.removeBlob(Example9.containerName + count, key);
>> > > > >      } catch (Exception ace) {
>> > > > >          System.out.println("Request failed for objkey " + key + "  " + ace);
>> > > > >      }
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Tue, Apr 22, 2014 at 8:32 AM, Sumit Gaur <[email protected]>
>> > > > wrote:
>> > > > >
>> > > > >> Hi Shri,
>> > > > >> Thanks for paying attention to it, Please find my answers below:-
>> > > > >>
>> > > > >>
>> > > > >> On Tue, Apr 22, 2014 at 2:31 AM, Shrinand Javadekar <
>> > > > >> [email protected]> wrote:
>> > > > >>
>> > > > >>> Sumit,
>> > > > >>>
>> > > > >>> I realize that you had sent out a similar email sometime ago
>> about
>> > > > >>> performance degradation. I'm not sure if anyone has run these
>> types
>> > > of
>> > > > >>> long running experiments with jclouds. So this may be a first.
>> > > > >>>
>> > > > >> Tried to debug it in last 2 weeks without success. Want to
>> understand
>> > > > more
>> > > > >> how jclouds code handle this use case or any pointers that this
>> is a
>> > > > >> problematic use case would help
>> > > > >>
>> > > > >>>
>> > > > >>> The 409 status is returned because of a conflict [1]. Are you
>> sure
>> > > you
>> > > > >>> didn't have two or more threads trying to delete the same object?
>> > > > >>>
>> > > > >> No two threads share the same object key in my programme (String
>> key =
>> > > > >> "objkey" + UUID.randomUUID();). It is some kind of race between
>> PUT
>> > > and
>> > > > >> DEL call . If I put say 10 ms sleep between call then there is no
>> 409
>> > > > error.
>> > > > >>
>> > > > >>
>> > > > >>> Also, I see that that 409 is returned by Swift if you try to
>> delete a
>> > > > >>> container that isn't empty[2]. Is that something your test code
>> > > > >>> could've tried?
>> > > > >>>
>> > > > >> I am trying to delete objects .. not containers.
>> > > > >>
>> > > > >>>
>> > > > >>> When you say there was a similar test you're trying with curl,
>> are
>> > > you
>> > > > >>> using the curl command-line utility or the libcurl library?
>> > > > >>
>> > > > >> curl command in shell script with for loops.
>> > > > >>
>> > > > >>
>> > > > >>> How are
>> > > > >>> you specifying the number of threads to use and what object each
>> > > > >>> thread should get/put/delete?
>> > > > >>>
>> > > > >>
>> > > > >> It is a java test programme using ThreadPoolExecutor. Somthing
>> > > similiar
>> > > > as
>> > > > >> here
>> > > > >>
>> > > > >>
>> > > >
>> > >
>> http://www.javacodegeeks.com/2013/01/java-thread-pool-example-using-executors-and-threadpoolexecutor.html
>> > > > >>
>> > > > >> Object is a 5KB file. with  key = "objkey" + UUID.randomUUID();
>> with
>> > > > Pool
>> > > > >> of 10  threads.
>> > > > >>
>> > > > >>
>> > > > >> Hope this would give a good inside. Let me know if you get any
>> problem
>> > > > >> here.
>> > > > >>
>> > > > >>
>> > > > >>>
>> > > > >>> Thanks.
>> > > > >>> -Shri
>> > > > >>>
>> > > > >>> [1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
>> > > > >>> [2] https://bugs.launchpad.net/horizon/+bug/1096084
>> > > > >>>
>> > > > >>> On Sun, Apr 20, 2014 at 5:55 PM, Sumit Gaur <
>> [email protected]>
>> > > > wrote:
>> > > > >>> > Hi
>> > > > >>> > I using jclouds lib integrated with Openstack Swift+ keystone
>> > > > >>> combinaiton.
>> > > > >>> > Things are working fine except stability test. After 20-30
>> hours of
>> > > > test
>> > > > >>> > jclouds/SWIFT start degrading in TPS and keep going down over
>> the
>> > > > time.
>> > > > >>> >
>> > > > >>> > 1) I am running the (PUT-GET-DEL) cycle in 10 parallel threads.
>> > > > >>> > 2) I am getting a lot of 409 and DEL failure for the as
>> response
>> > > too
>> > > > >>> from
>> > > > >>> > SWIFT.
>> > > > >>> > 3) Direct similiar test from curl does not show much impact
>> and TPS
>> > > > >>> remain
>> > > > >>> > constant.
>> > > > >>> >
>> > > > >>> > Can sombody help me wht is going wrong here ?
>> > > > >>> >
>> > > > >>> > Thanks
>> > > > >>> > sumit
>> > > > >>>
>> > > > >>
>> > > > >>
>> > > >
>> > >
>>
