Re: Amazon S3 data transfer for filemanager

2015-03-12 Thread John Reynolds
Thanks Michael, this works for me (the filemanager client starts); however, I'm still
getting the same data transfer error.


Re: Amazon S3 data transfer for filemanager

2015-03-12 Thread Michael Starch
John,

I should be more verbose.  The Java classpath traditionally did not pick up
multiple jars, so it was very labor-intensive to set up.  These days you
can use * inside the classpath to pick up multiple jars, but it must be
inside quotes, because otherwise the shell will glob the * before java
ever sees it.
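[The quoting pitfall can be seen with plain echo and a throwaway directory; the jar names and the /tmp path below are stand-ins for illustration only:]

```shell
# Set up a throwaway lib directory with two dummy jars.
mkdir -p /tmp/cp-demo/lib
touch /tmp/cp-demo/lib/a.jar /tmp/cp-demo/lib/b.jar

# Unquoted: the shell expands the * before java ever runs, so java
# would see a.jar as the classpath and b.jar as the main class name.
echo -classpath /tmp/cp-demo/lib/*

# Quoted: the literal lib/* reaches java unchanged; java itself
# (version 6 and later) expands it to every .jar in that directory.
echo -classpath "/tmp/cp-demo/lib/*"
```

[The first echo prints both jar paths; the second prints the wildcard literally, which is what java needs to receive.]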

If this doesn't work, try the other recommendations in:


http://stackoverflow.com/questions/219585/setting-multiple-jars-in-java-classpath

-Michael



Re: Amazon S3 data transfer for filemanager

2015-03-12 Thread Michael Starch
John,

Change:
-classpath "$FILEMGR_HOME"/lib \
To:
-classpath "$FILEMGR_HOME/lib/*" \

-Michael


Re: Amazon S3 data transfer for filemanager

2015-03-12 Thread John Reynolds
Thanks Michael, if I modify the filemgr-client script to look like this (at the end)

"$_RUNJAVA" $JAVA_OPTS $OODT_OPTS \
  -classpath "$FILEMGR_HOME"/lib \
  -Dorg.apache.oodt.cas.filemgr.properties="$FILEMGR_HOME"/etc/filemgr.properties \
  -Djava.util.logging.config.file="$FILEMGR_HOME"/etc/logging.properties \
  -Dorg.apache.oodt.cas.cli.action.spring.config=file:"$FILEMGR_HOME"/policy/cmd-line-actions.xml \
  -Dorg.apache.oodt.cas.cli.option.spring.config=file:"$FILEMGR_HOME"/policy/cmd-line-options.xml \
  org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient "$@"

(replacing the ext jars with -classpath) then I get

Error: Could not find or load main class
org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient

I assume I'm doing something wrong with the classpath, but I'm not sure what.


Re: Amazon S3 data transfer for filemanager

2015-03-12 Thread Michael Starch
John,

Can you open the filemgr-client sh script?  It may set JAVA_EXT_JARS
there.  If so, it is clobbering the default path for "extension" jars, and
your Java encryption jars are not being picked up. If it does set
JAVA_EXT_JARS, you have two options:

1. Move all your encryption jars into FILEMGR_HOME/lib/
2. Update the filemgr-client script to use -classpath to specify the jars in the
FILEMGR_HOME/lib directory, and remove the use of JAVA_EXT_JARS.
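[Option 2 amounts to something like the following sketch; the FILEMGR_HOME path is a stand-in, and echo stands in for the real "$_RUNJAVA" invocation in the script:]

```shell
# Stand-in FILEMGR_HOME just to make the sketch runnable.
FILEMGR_HOME=/tmp/filemgr-demo
mkdir -p "$FILEMGR_HOME/lib"

# Build the classpath argument: keep the * inside the quotes so the
# shell passes it through literally and java expands it at startup.
CP="$FILEMGR_HOME/lib/*"

# In the real script this would be the "$_RUNJAVA" launch line;
# echo here just shows the arguments java would receive.
echo java -classpath "$CP" org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient
```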


-Michael



Re: Amazon S3 data transfer for filemanager

2015-03-12 Thread John Reynolds
Hi Michael
yeah, it's OpenJDK 1.7 ("1.7.0_75")
I did download the unlimited-strength encryption policy jars from Oracle and replaced the
local_policy / US_export_policy jars in JAVA_HOME/jre/lib/security.
The more I read, maybe it's limited by jce.jar.

I don't have anything special set for extension jars.




Re: Amazon S3 data transfer for filemanager

2015-03-12 Thread Michael Starch
John,

What version of the JDK are you running, and what is your extension-jars
environment variable set to?  Do you have the Java cryptography (JCE) jar
included?  (The Oracle JDK usually has this; I don't know if OpenJDK does.)

"Algorithm HmacSHA1 not available" is usually thrown when Java cannot find
the Java crypto jar used to calculate the given hash.
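[One quick way to test this directly, assuming a JDK 7/8 is on the PATH (jrunscript ships with both Oracle JDK and OpenJDK of that era), is to ask the JRE for the algorithm itself:]

```shell
# If the JDK's scripting shell is available, try to instantiate the Mac;
# jrunscript throws NoSuchAlgorithmException when the JCE provider is missing.
if command -v jrunscript >/dev/null 2>&1; then
    jrunscript -e 'javax.crypto.Mac.getInstance("HmacSHA1"); println("HmacSHA1 available")'
else
    echo "no jrunscript on PATH - cannot check"
fi
```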

-Michael

On Thu, Mar 12, 2015 at 9:06 AM, John Reynolds  wrote:

> Hi Lewis,
> using the latest docker buggtb/oodt image, which I assume is 0.8
> here's the command I'm running to test the upload
>
> filemgr-client --url http://localhost:9000 --operation --ingestProduct
> --productName test --productStructure Flat --productTypeName GenericFile
> --metadataFile file:///root/test.txt.met --refs file:///root/test.txt
>
> I verified that I can upload to the path using the s3 tools on the box,
> with the same credentials I put in the properties file
>
> here’s the full exception returned:
>
> org.apache.oodt.cas.filemgr.structs.exceptions.DataTransferException:

Re: Amazon S3 data transfer for filemanager

2015-03-12 Thread John Reynolds
Hi Lewis,
Using the latest Docker buggtb/oodt image, which I assume is 0.8.
Here's the command I'm running to test the upload:

filemgr-client --url http://localhost:9000 --operation --ingestProduct 
--productName test --productStructure Flat --productTypeName GenericFile 
--metadataFile file:///root/test.txt.met --refs file:///root/test.txt

I verified that I can upload to the path using the S3 tools on the box, with the
same credentials I put in the properties file.

Here's the full exception returned:

org.apache.oodt.cas.filemgr.structs.exceptions.DataTransferException: 
org.apache.oodt.cas.filemgr.structs.exceptions.DataTransferException: Failed to 
upload product reference /root/test.txt to S3 at 
usr/src/oodt/data/archive/test/test.txt
at 
org.apache.oodt.cas.filemgr.system.XmlRpcFileManager.ingestProduct(XmlRpcFileManager.java:768)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.xmlrpc.Invoker.execute(Invoker.java:130)
at org.apache.xmlrpc.XmlRpcWorker.invokeHandler(XmlRpcWorker.java:84)
at org.apache.xmlrpc.XmlRpcWorker.execute(XmlRpcWorker.java:146)
at org.apache.xmlrpc.XmlRpcServer.execute(XmlRpcServer.java:139)
at org.apache.xmlrpc.XmlRpcServer.execute(XmlRpcServer.java:125)
at org.apache.xmlrpc.WebServer$Connection.run(WebServer.java:761)
at org.apache.xmlrpc.WebServer$Runner.run(WebServer.java:642)
at java.lang.Thread.run(Thread.java:745)
Caused by: 
org.apache.oodt.cas.filemgr.structs.exceptions.DataTransferException: Failed to 
upload product reference /root/test.txt to S3 at 
usr/src/oodt/data/archive/test/test.txt
at 
org.apache.oodt.cas.filemgr.datatransfer.S3DataTransferer.transferProduct(S3DataTransferer.java:78)
at 
org.apache.oodt.cas.filemgr.system.XmlRpcFileManager.ingestProduct(XmlRpcFileManager.java:752)
... 12 more
Caused by: com.amazonaws.AmazonClientException: Unable to calculate a request 
signature: Unable to calculate a request signature: Algorithm HmacSHA1 not 
available
at 
com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:71)
at 
com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:57)
at com.amazonaws.services.s3.internal.S3Signer.sign(S3Signer.java:128)
at 
com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:330)
at 
com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
at 
com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
at 
com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1393)
at 
org.apache.oodt.cas.filemgr.datatransfer.S3DataTransferer.transferProduct(S3DataTransferer.java:76)
... 13 more
Caused by: com.amazonaws.AmazonClientException: Unable to calculate a request 
signature: Algorithm HmacSHA1 not available
at com.amazonaws.auth.AbstractAWSSigner.sign(AbstractAWSSigner.java:90)
at 
com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:68)
... 20 more
Caused by: java.security.NoSuchAlgorithmException: Algorithm HmacSHA1 not 
available
at javax.crypto.Mac.getInstance(Mac.java:176)
at com.amazonaws.auth.AbstractAWSSigner.sign(AbstractAWSSigner.java:86)
... 21 more
org.apache.xmlrpc.XmlRpcException: java.lang.Exception: 
org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException: Error 
ingesting product [org.apache.oodt.cas.filemgr.structs.Product@6454bbe1] : 
org.apache.oodt.cas.filemgr.structs.exceptions.DataTransferException: Failed to 
upload product reference /root/test.txt to S3 at 
usr/src/oodt/data/archive/test/test.txt
at 
org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeException(XmlRpcClientResponseProcessor.java:104)
at 
org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeResponse(XmlRpcClientResponseProcessor.java:71)
at 
org.apache.xmlrpc.XmlRpcClientWorker.execute(XmlRpcClientWorker.java:73)
at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:194)
at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:185)
at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:178)
at 
org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient.ingestProduct(XmlRpcFileManagerClient.java:1198)
at 
org.apache.oodt.cas.filemgr.cli.action.IngestProductCliAction.execute(IngestProductCliAction.java:112)
at 
org.apache.oodt.cas.cli.CmdLineUtility.execute(CmdLineUtility.java:331)
at org.apache.oodt.cas.cli.CmdLineUtility.run(CmdLineUtility.java:187)
at 
org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient.
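
The innermost cause in the trace is `java.security.NoSuchAlgorithmException: Algorithm HmacSHA1 not available`, which points at the JRE's crypto providers rather than at OODT, the credentials, or the bucket path. A minimal standalone check (hypothetical class name), run with the same JVM the file manager uses, is:

```java
import javax.crypto.Mac;
import java.security.NoSuchAlgorithmException;
import java.security.Provider;
import java.security.Security;

public class HmacSha1Check {
    public static void main(String[] args) {
        // List installed JCA providers; HmacSHA1 normally comes from SunJCE.
        for (Provider p : Security.getProviders()) {
            System.out.println("provider: " + p.getName());
        }
        try {
            // The same lookup that fails inside the AWS SDK signer above.
            Mac mac = Mac.getInstance("HmacSHA1");
            System.out.println("HmacSHA1 available via " + mac.getProvider().getName());
        } catch (NoSuchAlgorithmException e) {
            System.out.println("HmacSHA1 NOT available: " + e.getMessage());
        }
    }
}
```

If this prints "NOT available", the JRE inside the Docker image is missing or misconfiguring its SunJCE provider (e.g. a damaged `java.security` file), which would reproduce the failure regardless of the S3 settings.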

Re: Data processing pipeline workflow management

2015-03-12 Thread Michael Starch
Yes. Our batch-processing back-end for the Resource Manager can now take
advantage of the Mesos cluster manager.

This enables OODT to farm batch processing out to a Mesos cluster.

-Michael


Re: Data processing pipeline workflow management

2015-03-12 Thread BW
Any thoughts on integrating a plug in service with Marathon first then
layer Mesos on top?

On Wednesday, March 11, 2015, Mattmann, Chris A (3980) <
chris.a.mattm...@jpl.nasa.gov> wrote:

> Apache OODT now has a workflow plugin that connects to Mesos:
>
> http://oodt.apache.org/
>
> Cross posting this to dev@oodt.apache.org  so people like
> Mike Starch can chime in.
>
> ++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattm...@nasa.gov 
> WWW:  http://sunset.usc.edu/~mattmann/
> ++
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++
>
>
>
>
>
>
> -Original Message-
> From: Zameer Manji >
> Reply-To: "d...@aurora.incubator.apache.org "
> >
> Date: Wednesday, March 11, 2015 at 3:21 PM
> To: "d...@aurora.incubator.apache.org " <
> d...@aurora.incubator.apache.org >
> Subject: Re: Data processing pipeline workflow management
>
> >Hey,
> >
> >This is a great question. See my comments inline below.
> >
> >On Tue, Mar 10, 2015 at 8:28 AM, Lars Albertsson
> >>
> >wrote:
> >
> >> We are evaluating Aurora as a workflow management tool for batch
> >> processing pipelines. We basically need a tool that regularly runs
> >> batch processes that are connected as producers/consumers of data,
> >> typically stored in HDFS or S3.
> >>
> >> The alternative tools would be Azkaban, Luigi, and Oozie, but I am
> >> hoping that building something built on Aurora would result in a
> >> better solution.
> >>
> >> Does anyone have experience with building workflows with Aurora? How
> >> is Twitter handling batch pipelines? Would the approach below make
> >> sense, or are there better suggestions? Is there anything related to
> >> this in the roadmap or available inside Twitter only?
> >>
> >
> >As far as I know, you are the first person to consider Aurora for workflow
> >management for batch processing. Currently Twitter does not use Aurora for
> >batch pipelines.
> >I'm not aware of the specifics of the design, but at Twitter there is an
> >internal solution for pipelines built upon Hadoop/YARN.
> >Currently Aurora is designed around being a service scheduler and I'm not
> >aware of any future plans to support workflows or batch computation.
> >
> >
> >> In our case, the batch processes will be a mix of cluster
> >> computations with Spark, and single-node computations. We want the
> >> latter to also be scheduled on a farm, and this is why we are
> >> attracted to Mesos. In the text below, I'll call each part of a
> >> pipeline a 'step', in order to avoid confusion with Aurora jobs and
> >> tasks.
> >>
> >> My unordered wishlist is:
> >> * Data pipelines consist of DAGs, where steps take one or more inputs,
> >> and generate one or more outputs.
> >>
> >> * Independent steps in the DAG execute in parallel, constrained by
> >> resources.
> >>
> >> * Steps can be written in different languages and frameworks, some
> >> clustered.
> >>
> >> * The developer code/test/debug cycle is quick, and all functional
> >> tests can execute on a laptop.
> >>
> >> * Developers can test integrated data pipelines, consisting of
> >> multiple steps, on laptops.
> >>
> >> * Steps and their inputs and outputs are parameterised, e.g. by date.
> >> A parameterised step is typically independent from other instances of
> >> the same step, e.g. join one day's impressions log with user
> >> demographics. In some cases, steps depend on yesterday's results, e.g.
> >> apply one day's user management operation log to the user dataset from
> >> the day before.
> >>
> >> * Data pipelines are specified in embedded DSL files (e.g. aurora
> >> files), kept close to the business logic code.
> >>
> >> * Batch steps should be started soon after the input files become
> >> available.
> >>
> >> * Steps should gracefully avoid recomputation when output files exist.
> >>
> >> * Backfilling a window back in time, e.g. 30 days, should happen
> >> automatically if some earlier steps have failed, or if output files
> >> have been deleted manually.
> >>
> >> * Continuous deployment in the sense that steps are automatically
> >> deployed and scheduled after 'git push'.
> >>
> >> * Step owners can get an overview of step status and history, and
> >> debug step execution, e.g. by accessing log files.
> >>
> >>
> >> I am aware that no framework will give us everything. It is a matter
> >> of how much we need to live without or build ourselves.
> >>
> >
> >Your wishlist looks pretty reasonable for batch computation workflows.
> >
> >I'm not aware of any batch/workflow Mesos framework. If you want some or
> >all of the above features on top of Mes
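
The core of the wishlist above (DAG ordering, skip-if-output-exists, manual backfill by deleting outputs) can be captured in a small dependency-free sketch; names and the file-based completeness check are illustrative assumptions, not any framework's API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** One pipeline step: a name, an output file, and upstream dependencies. */
class Step {
    final String name;
    final Path output;
    final List<Step> deps;
    boolean ran = false;

    Step(String name, Path output, List<Step> deps) {
        this.name = name;
        this.output = output;
        this.deps = deps;
    }

    /** Run dependencies first (DAG order); skip any step whose output exists. */
    void execute() throws IOException {
        for (Step dep : deps) {
            dep.execute();
        }
        if (!Files.exists(output)) {
            // Stand-in for real business logic: materialise the output file.
            Files.write(output, (name + "\n").getBytes());
            ran = true;
        }
    }
}
```

Calling `execute()` on the terminal step runs the whole chain once; calling it again skips every step whose output file survives, and deleting one output re-runs only that step, which is the manual-backfill behaviour the wishlist describes.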