Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-26 Thread CloudBees pull request builder plugin
[jclouds-java-7-pull-requests 
#1165](https://jclouds.ci.cloudbees.com/job/jclouds-java-7-pull-requests/1165/) 
SUCCESS
This pull request looks good

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214#issuecomment-38656006

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-26 Thread Andrew Gaul
 }
  
 -   private boolean parentIsFolder(final ListContainerOptions options, final 
 StorageMetadata md) {
 -  return options.getDir() != null  md.getName().indexOf('/') == -1;
 +   private void waitForCompletion(final Semaphore semaphore,
 + final SetListenableFutureVoid outstandingFutures) {
 +  // Wait for all futures to complete by waiting to acquire all
 +  // semaphores.
 +  try {
 + // TODO: Each individual blob delete operation itself has a time
 + // limit of 'maxTime'. Therefore having a time limit for this
 + // semaphore acquisition provides little value. This could be
 + // removed.

I removed the timeout as the comment suggests since we can easily encounter 
this situation with large containers.

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214/files#r10997822

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-26 Thread Andrew Gaul
I committed to master with some Checkstyle, Javadoc, and indentation fixes.  
Let's address some of the TODOs and test more before backporting to 1.7.x.  
Thank you for your contribution @shrinandj !  The previous implementation has 
troubled us for some years now.

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214#issuecomment-38737591

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-26 Thread Andrew Gaul
Closed #214.

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-25 Thread Andrew Gaul
 + final String fullPath = parentIsFolder(options, md) ? 
 options.getDir()
 +   + / + md.getName() : md.getName();
 +
 +// Attempt to acquire a semaphore within the time limit. At least
 +// one outstanding future should complete within this period for 
 the
 +// semaphore to be acquired.
 + try {
 +if (!semaphore.tryAcquire(maxTime, TimeUnit.MILLISECONDS)) {
 +   throw new TimeoutException(Timeout waiting for semaphore);
 +}
 + } catch (InterruptedException ie) {
 +logger.debug(Interrupted while deleting blobs);
 +Thread.currentThread().interrupt();
 + }
 +
 + final ListenableFutureVoid blobDelFuture;

If you wrap ```blobDelFuture``` in an ```Optional``` will this remove some of 
the duplicated null assignments?

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214/files#r10940709

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-24 Thread Andrew Gaul
 + switch (md.getType()) {
 + case BLOB:
 +blobDelFuture = executorService.submit(new CallableVoid() {
 +   @Override
 +  public Void call() {
 + blobStore.removeBlob(containerName, fullPath);
 + return null;
 +  }
 +   });
 +break;
 + case FOLDER:
 +if (options.isRecursive()) {
 +   blobDelFuture = executorService.submit(new CallableVoid() {
 +  @Override
 +  public Void call() {
 + blobStore.deleteDirectory(containerName, fullPath);

Previously ```clearContainer``` issued a batch of delete blob operations within 
a subtree before removing the parent but now it can issue delete blob 
operations concurrently with delete parent.  I believe the new logic will 
generate operation failures.  This is generally a hard problem to solve; can we 
simplify this for now by making the recursive call blocking?

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214/files#r10912447

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-23 Thread Shri Javadekar
 + // If a future to delete a blob/directory actually got created 
 above,
 + // keep a reference of that in the outstandingFutures list. This is
 + // useful in case of a timeout exception. All outstanding futures 
 can
 + // then be cancelled.
 + if (blobDelFuture != null) {
 +final AtomicReferenceListenableFutureVoid 
 atomicBlobDelFuture =
 +   new AtomicReferenceListenableFutureVoid(blobDelFuture);
 +outstandingFutures.add(atomicBlobDelFuture);
 +
 +// Add a callback to release the semaphore. This is required for
 +// other threads waiting to acquire a semaphore above to make
 +// progress.
 +Futures.addCallback(blobDelFuture, new FutureCallbackObject() {
 +   @Override
 +   public void onSuccess(final Object o) {
 +  outstandingFutures.remove(atomicBlobDelFuture);

Done.

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214/files#r10872324

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-23 Thread Shri Javadekar
  import com.google.common.util.concurrent.ListenableFuture;
  import com.google.common.util.concurrent.ListeningExecutorService;
  import com.google.inject.Inject;
  
  /**
   * Deletes all keys in the container
 - * 
 + *
   * @author Adrian Cole

Done

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214/files#r10872338

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-23 Thread Shri Javadekar
 +   * a timeout. Also, when the reference is removed from this list and 
 when
 +   * the executorService removes the reference that it has maintained, 
 the
 +   * future will be marked for GC since there should be no other 
 references
 +   * to it. This is important because this code can generate an unbounded
 +   * number of futures.
 +   */
 +  final ListAtomicReferenceListenableFutureVoid outstandingFutures 
 = Collections
 +.synchronizedList(new 
 ArrayListAtomicReferenceListenableFutureVoid());
 +  while (retries  0) {
 + deleteFailure.set(false);
 + executeOneIteration(containerName, listOptions.clone(), semaphore,
 +   outstandingFutures, deleteFailure);
 + // Wait for all futures to complete by waiting to acquire all
 + // semaphores.
 + try {
 +if (!semaphore.tryAcquire(numOutStandingRequests, maxTime,

Added a TODO for this.

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214/files#r10872424

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-23 Thread Shri Javadekar
 + listing = blobStore.list(containerName, options);
 +  } catch (ContainerNotFoundException ce) {
 + return listing;
 +  }
 +
 +  // recurse on subdirectories
 +  if (options.isRecursive()) {
 + for (StorageMetadata md : listing) {
 +String fullPath = parentIsFolder(options, md) ? options.getDir()
 +  + / + md.getName() : md.getName();
 +switch (md.getType()) {
 +case BLOB:
 +   break;
 +case FOLDER:
 +case RELATIVE_PATH:
 +   if (options.isRecursive() 

Removed.

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214/files#r10872428

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-23 Thread Shri Javadekar
 + String marker = listing.getNextMarker();
 + if (marker != null) {
 +logger.debug(%s with marker %s, message, marker);
 +options = options.afterMarker(marker);
 +listing = getListing(containerName, options, semaphore,
 +  outstandingFutures, deleteFailure);
 + } else {
 +break;
 + }
 +  }
 +   }
 +
 +   public void execute(final String containerName,
 + ListContainerOptions listOptions) {
 +  final AtomicBoolean deleteFailure = new AtomicBoolean();
 +  final int numOutStandingRequests = 1024;

Done.

There is a configurable property called jclouds.max-parallel-deletes that can 
be used for this.

You're point about whether the # of semaphores should be equal to the # of 
threads in the executorService is valid. The default value of 
jclouds.max-parallel-threads is equal to the # of user threads.

Given that these two are equal, I thought about whether we need the extra 
config option. Can we just use the value of PROPERTY_USER_THREADS instead. 
However, I'm not sure if the executorService object injected into 
DeleteAllKeysInList is  also being used somewhere else. If it is used 
somewhere, I guess, it would be good to have an option where the use can limit 
the number of parallel deletes being issued when deleting a container thereby 
limiting the number of threads being used from the executorService.

Let me know if this isn't necessary.

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214/files#r10872420

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-23 Thread Shri Javadekar
Looks like a network glitch on cloudbees.

code
ERROR: Error cloning remote repo 'origin'
hudson.plugins.git.GitException: Could not clone 
git://github.com/jclouds/jclouds.git
...
stderr: fatal: unable to connect to github.com:
github.com[0: 192.30.252.129]: errno=Connection timed out
/code

Can we start another build?

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214#issuecomment-38413431

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-21 Thread Andrew Gaul
  import com.google.common.util.concurrent.ListenableFuture;
  import com.google.common.util.concurrent.ListeningExecutorService;
  import com.google.inject.Inject;
  
  /**
   * Deletes all keys in the container
 - * 
 + *
   * @author Adrian Cole

Add your name to since this is a large body of new code?

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214/files#r10853480

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-21 Thread Andrew Gaul
Can you open a JIRA issue and reference it in the commit message?  This will 
allow us to communicate this improvement in the release notes.

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214#issuecomment-38321551

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-21 Thread Andrew Gaul
 +   * acquired in 'maxTime', a TimeoutException is thrown. Any outstanding
 +   * futures at that time are cancelled.
 +   */
 +  final Semaphore semaphore = new Semaphore(numOutStandingRequests);
 +  /*
 +   * When a future is created, a reference for that is added to the
 +   * outstandingFutures list. This reference is removed from the list in 
 the
 +   * FutureCallback since it no longer needs to be cancelled in the 
 event of
 +   * a timeout. Also, when the reference is removed from this list and 
 when
 +   * the executorService removes the reference that it has maintained, 
 the
 +   * future will be marked for GC since there should be no other 
 references
 +   * to it. This is important because this code can generate an unbounded
 +   * number of futures.
 +   */
 +  final ListAtomicReferenceListenableFutureVoid outstandingFutures 
 = Collections
 +.synchronizedList(new 
 ArrayListAtomicReferenceListenableFutureVoid());

Does the AtomicReference provide value that the synchronized wrapper does not?

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214/files#r10853586

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-21 Thread Andrew Gaul
 +   * a timeout. Also, when the reference is removed from this list and 
 when
 +   * the executorService removes the reference that it has maintained, 
 the
 +   * future will be marked for GC since there should be no other 
 references
 +   * to it. This is important because this code can generate an unbounded
 +   * number of futures.
 +   */
 +  final ListAtomicReferenceListenableFutureVoid outstandingFutures 
 = Collections
 +.synchronizedList(new 
 ArrayListAtomicReferenceListenableFutureVoid());
 +  while (retries  0) {
 + deleteFailure.set(false);
 + executeOneIteration(containerName, listOptions.clone(), semaphore,
 +   outstandingFutures, deleteFailure);
 + // Wait for all futures to complete by waiting to acquire all
 + // semaphores.
 + try {
 +if (!semaphore.tryAcquire(numOutStandingRequests, maxTime,

These timeouts provide little value over the per-operation timouts and we may 
want to remove them in a subsequent commit.

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214/files#r10853876

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-21 Thread Andrew Gaul
@shrinandj This commit represents a big improvement and I apologize for my 
delayed comments.  Can you address some of these and add TODOs for the rest so 
we can commit this as soon as possible?  Specifically we must address the O(n) 
behavior and I do not understand some of the synchronization.  Further I have 
some doubts about how the size of the ```ExecutorService``` interacts with the 
```Semaphore```; should the sizes of the two be the same?

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214#issuecomment-38323072

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-21 Thread Andrew Gaul
 + listing = blobStore.list(containerName, options);
 +  } catch (ContainerNotFoundException ce) {
 + return listing;
 +  }
 +
 +  // recurse on subdirectories
 +  if (options.isRecursive()) {
 + for (StorageMetadata md : listing) {
 +String fullPath = parentIsFolder(options, md) ? options.getDir()
 +  + / + md.getName() : md.getName();
 +switch (md.getType()) {
 +case BLOB:
 +   break;
 +case FOLDER:
 +case RELATIVE_PATH:
 +   if (options.isRecursive() 

Repeated test of isRecusive from above.

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214/files#r10856243

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-05 Thread Andrew Phillips
 but haven't been able to reproduce the problem.

OK, let's kick the PR builders again, then. I also suspect it's transient. 
Thanks, @shrinandj!

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214#issuecomment-36737307

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-05 Thread Andrew Phillips
Reopened #214.

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-05 Thread BuildHive
[jclouds » jclouds 
#889](https://buildhive.cloudbees.com/job/jclouds/job/jclouds/889/) SUCCESS
This pull request looks good
[(what's this?)](https://www.cloudbees.com/what-is-buildhive)

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214#issuecomment-36740432

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-05 Thread CloudBees pull request builder plugin
[jclouds-pull-requests 
#639](https://jclouds.ci.cloudbees.com/job/jclouds-pull-requests/639/) SUCCESS
This pull request looks good

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214#issuecomment-36741644

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-05 Thread CloudBees pull request builder plugin
[jclouds-java-7-pull-requests 
#1109](https://jclouds.ci.cloudbees.com/job/jclouds-java-7-pull-requests/1109/) 
SUCCESS
This pull request looks good

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214#issuecomment-36742089

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-05 Thread Andrew Phillips
  I also suspect it's transient.

Bingo ;-)

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214#issuecomment-36742345

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-04 Thread Andrew Phillips
 jclouds » jclouds #882 UNSTABLE

This looks like a real 
[failure](https://buildhive.cloudbees.com/job/jclouds/job/jclouds/org.apache.jclouds$jclouds-blobstore/882/testReport/junit/org.jclouds.blobstore.strategy.internal/DeleteAllKeysInListTest/testExecuteInDirectory/)?
 Could you have a look, @shrinandj?

We can also close'n'reopen this PR to trigger the builders again if we think 
it's likely to be transient...

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214#issuecomment-36670947

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-04 Thread Shri Javadekar
I am trying to reproduce this locally. I have run  50 iterations of this test 
so far, but haven't been able to reproduce the problem.

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214#issuecomment-36681788

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-03-03 Thread Shri Javadekar
I tried varying the number of threads from 10 to 50 and deleting a container 
with at least 5K blobs. Things worked fine. I'll rebase this patch on top of 
the current ToT and send out.

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214#issuecomment-36594782

Re: [jclouds] Delete objects in a container efficiently. (#214)

2014-02-10 Thread Andrew Gaul
Sorry for ignoring this.  By default jclouds uses an unlimited number of user 
threads, set in ```BaseApiMetadata.defaultProperties```.  I believe this will 
cause problems for users of Atmos blobstores.  We can work around this by 
setting a lower number in ```AtmosApiMetadata.defaultProperties```.  Previously 
I have successfully tested with 8 user and 8 IO threads although this depends 
on the Atmos version and configuration.  @shrinandj will run some further tests 
with the defaults and I will look at a solution to 
[JCLOUDS-137](https://issues.apache.org/jira/browse/JCLOUDS-137).

I also opened [JCLOUDS-459](https://issues.apache.org/jira/browse/JCLOUDS-459) 
to track adding some limit to all blobstores.

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/214#issuecomment-34665263