Re: Amazon S3 data transfer for filemanager
Thanks Michael, this works for me (the filemanager starts); however, I'm still getting the same data-transfer error.
Re: Amazon S3 data transfer for filemanager
John,

I should be more verbose. The Java classpath traditionally did not pick up multiple jars, so it was very labor-intensive to set up. These days you can use * inside the classpath to pick up multiple jars, but it must be inside quotes (" "), because otherwise the shell will glob the * before the JVM ever sees it.

If this doesn't work, try the other recommendations in:

http://stackoverflow.com/questions/219585/setting-multiple-jars-in-java-classpath

-Michael
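The quoting behavior Michael describes can be sketched in plain shell; the directory and jar names below are throwaway examples created in a temp dir, not part of any OODT install:

```shell
# Demo of why the classpath wildcard must be quoted.
libdir=$(mktemp -d)
touch "$libdir/a.jar" "$libdir/b.jar"

# Unquoted: the shell globs the * itself, so java would receive two
# separate arguments instead of a single -classpath value.
set -- $libdir/*
echo "unquoted expands to $# words"   # -> 2

# Quoted: the literal string "<dir>/*" survives word-splitting, so the
# JVM (which understands a trailing /* since Java 6) expands it itself
# to every jar in the directory.
set -- "$libdir/*"
echo "quoted stays as $# word"        # -> 1

rm -rf "$libdir"
```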
Re: Amazon S3 data transfer for filemanager
John,

Change:

-classpath "$FILEMGR_HOME"/lib \

To:

-classpath "$FILEMGR_HOME/lib/*" \

-Michael
Re: Amazon S3 data transfer for filemanager
Thanks Michael. If I modify filemgr-client to look like this (at the end):

"$_RUNJAVA" $JAVA_OPTS $OODT_OPTS \
  -classpath "$FILEMGR_HOME"/lib \
  -Dorg.apache.oodt.cas.filemgr.properties="$FILEMGR_HOME"/etc/filemgr.properties \
  -Djava.util.logging.config.file="$FILEMGR_HOME"/etc/logging.properties \
  -Dorg.apache.oodt.cas.cli.action.spring.config=file:"$FILEMGR_HOME"/policy/cmd-line-actions.xml \
  -Dorg.apache.oodt.cas.cli.option.spring.config=file:"$FILEMGR_HOME"/policy/cmd-line-options.xml \
  org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient "$@"

(replacing ext jars with -classpath), then I get:

Error: Could not find or load main class org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient

I assume I'm doing something wrong with the classpath, but I'm not sure what.
Re: Amazon S3 data transfer for filemanager
John,

Can you open the filemgr-client sh script? It may set JAVA_EXT_JARS there. If so, it is clobbering the default path for "extension" jars, and your Java encryption jars are not being picked up. If it does set JAVA_EXT_JARS, you have two options:

1. Move all your encryption jars into FILEMGR_HOME/lib/
2. Update the filemgr-client script to use -classpath to specify the jars in the FILEMGR_HOME/lib directory, and remove the use of JAVA_EXT_JARS

-Michael
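A sketch of the two options above; FILEMGR_HOME and the jar source directory are mocked up as temp dirs purely for illustration, and the jar name is hypothetical:

```shell
# Option 1: copy the encryption jars next to the jars the script already loads.
FILEMGR_HOME=$(mktemp -d)       # stand-in for the real install directory
mkdir -p "$FILEMGR_HOME/lib"
extra=$(mktemp -d)              # stand-in for wherever the extra jars live
touch "$extra/crypto-provider.jar"

cp "$extra"/*.jar "$FILEMGR_HOME/lib/"
ls "$FILEMGR_HOME/lib"

# Option 2 (shape only, shown as a comment): drop JAVA_EXT_JARS and pass
# the lib directory to the JVM via -classpath instead:
#   "$_RUNJAVA" -classpath "$FILEMGR_HOME/lib/*" \
#       org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient "$@"

rm -rf "$FILEMGR_HOME" "$extra"
```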
Re: Amazon S3 data transfer for filemanager
Hi Michael,
Yeah, it's OpenJDK 1.7 ("1.7.0_75"). I did download the unlimited-strength encryption jars from Oracle and replaced the local_policy / us_export_policy jars in $JAVA_HOME/jre/lib/security. The more I read, maybe it's limited by jce.jar.

I don't have anything special set for extension jars.
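A quick way to check whether the replaced policy jars actually took effect, assuming a JDK is on the PATH (jrunscript ships with the JDK, though not with a bare JRE):

```shell
# Prints the maximum AES key length the installed policy files allow:
# 128 means the default "limited" policy is still active, while
# 2147483647 (Integer.MAX_VALUE) means the unlimited-strength policy
# files were picked up from jre/lib/security.
jrunscript -e 'print(javax.crypto.Cipher.getMaxAllowedKeyLength("AES"))'
```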
Re: Amazon S3 data transfer for filemanager
John,

What version of the JDK are you running, and what is your extension jars environment variable set to? Do you have the Java cryptography (JCE) jar included? (Oracle JDK usually has this; I don't know if OpenJDK does.)

"Algorithm HmacSHA1 not available" is usually thrown when Java cannot find the Java crypto jar used to calculate the given hash.

-Michael

On Thu, Mar 12, 2015 at 9:06 AM, John Reynolds wrote:

> Hi Lewis,
> using the latest docker buggtb/oodt image, which i assume is .8
> here’s the command i’m running to test the upload
>
> filemgr-client --url http://localhost:9000 --operation --ingestProduct
> --productName test --productStructure Flat --productTypeName GenericFile
> --metadataFile file:///root/test.txt.met --refs file:///root/test.txt
>
> i verified that i can upload to the path using the s3 tools on the box /
> with same credentials i put in the properties file
>
> here’s the full exception returned:
>
> org.apache.oodt.cas.filemgr.structs.exceptions.DataTransferException:
> org.apache.oodt.cas.filemgr.structs.exceptions.DataTransferException:
> Failed to upload product reference /root/test.txt to S3 at
> usr/src/oodt/data/archive/test/test.txt
>     at org.apache.oodt.cas.filemgr.system.XmlRpcFileManager.ingestProduct(XmlRpcFileManager.java:768)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.xmlrpc.Invoker.execute(Invoker.java:130)
>     at org.apache.xmlrpc.XmlRpcWorker.invokeHandler(XmlRpcWorker.java:84)
>     at org.apache.xmlrpc.XmlRpcWorker.execute(XmlRpcWorker.java:146)
>     at org.apache.xmlrpc.XmlRpcServer.execute(XmlRpcServer.java:139)
>     at org.apache.xmlrpc.XmlRpcServer.execute(XmlRpcServer.java:125)
>     at org.apache.xmlrpc.WebServer$Connection.run(WebServer.java:761)
>     at org.apache.xmlrpc.WebServer$Runner.run(WebServer.java:642)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.oodt.cas.filemgr.structs.exceptions.DataTransferException:
> Failed to upload product reference /root/test.txt to S3 at
> usr/src/oodt/data/archive/test/test.txt
>     at org.apache.oodt.cas.filemgr.datatransfer.S3DataTransferer.transferProduct(S3DataTransferer.java:78)
>     at org.apache.oodt.cas.filemgr.system.XmlRpcFileManager.ingestProduct(XmlRpcFileManager.java:752)
>     ... 12 more
> Caused by: com.amazonaws.AmazonClientException: Unable to calculate a
> request signature: Unable to calculate a request signature: Algorithm
> HmacSHA1 not available
>     at com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:71)
>     at com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:57)
>     at com.amazonaws.services.s3.internal.S3Signer.sign(S3Signer.java:128)
>     at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:330)
>     at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
>     at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
>     at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1393)
>     at org.apache.oodt.cas.filemgr.datatransfer.S3DataTransferer.transferProduct(S3DataTransferer.java:76)
>     ... 13 more
> Caused by: com.amazonaws.AmazonClientException: Unable to calculate a
> request signature: Algorithm HmacSHA1 not available
>     at com.amazonaws.auth.AbstractAWSSigner.sign(AbstractAWSSigner.java:90)
>     at com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:68)
>     ... 20 more
> Caused by: java.security.NoSuchAlgorithmException: Algorithm HmacSHA1 not
> available
>     at javax.crypto.Mac.getInstance(Mac.java:176)
>     at com.amazonaws.auth.AbstractAWSSigner.sign(AbstractAWSSigner.java:86)
>     ... 21 more
>
> org.apache.xmlrpc.XmlRpcException: java.lang.Exception:
> org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException: Error
> ingesting product [org.apache.oodt.cas.filemgr.structs.Product@6454bbe1]
> : org.apache.oodt.cas.filemgr.structs.exceptions.DataTransferException:
> Failed to upload product reference /root/test.txt to S3 at
> usr/src/oodt/data/archive/test/test.txt
>     at org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeException(XmlRpcClientResponseProcessor.java:104)
>     at org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeResponse(XmlRpcClientResponseProcessor.java:71)
>     at org.apache.xmlrpc.XmlRpcClientWorker.execute(XmlRpcClientWorker.java:73)
>     at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:194)
>     at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:185)
>     at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:178)
>     at org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient.ingestProduct(XmlRpcFileManagerClient.java:1198)
>     at org.apache.oodt.cas.filemgr.cli.action.IngestProductCliAction.execute(IngestProductCliAction.java:112)
>     at org.apache.oodt.cas.cli.CmdLineUtility.execute(CmdLineUtility.java:331)
>     at org.apache.oodt.cas.cli.CmdLineUtility.run(CmdLineUtility.java:187)
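[Editor's note: the root cause above, a `NoSuchAlgorithmException` thrown from `javax.crypto.Mac.getInstance`, can be reproduced in isolation. The sketch below (the class name `HmacCheck` is hypothetical) checks whether the running JRE can resolve HmacSHA1 and lists the registered security providers; if SunJCE is absent from the list, jce.jar is not on the JRE's path.]

```java
import java.security.NoSuchAlgorithmException;
import java.security.Provider;
import java.security.Security;
import javax.crypto.Mac;

public class HmacCheck {
    /** Returns true if the running JRE can resolve the given Mac algorithm. */
    static boolean macAvailable(String algorithm) {
        try {
            Mac.getInstance(algorithm);
            return true;
        } catch (NoSuchAlgorithmException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The AWS SDK's S3Signer needs HmacSHA1 to sign requests.
        System.out.println("HmacSHA1 available: " + macAvailable("HmacSHA1"));

        // List registered security providers; SunJCE supplies HmacSHA1.
        for (Provider p : Security.getProviders()) {
            System.out.println(p.getName());
        }
    }
}
```

Running this inside the same container as the file manager would show whether the failure is environmental rather than an OODT bug.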
Re: Amazon S3 data transfer for filemanager
Hi Lewis,
using the latest docker buggtb/oodt image, which i assume is .8
here’s the command i’m running to test the upload

filemgr-client --url http://localhost:9000 --operation --ingestProduct
--productName test --productStructure Flat --productTypeName GenericFile
--metadataFile file:///root/test.txt.met --refs file:///root/test.txt

i verified that i can upload to the path using the s3 tools on the box /
with same credentials i put in the properties file

here’s the full exception returned:

org.apache.oodt.cas.filemgr.structs.exceptions.DataTransferException:
org.apache.oodt.cas.filemgr.structs.exceptions.DataTransferException:
Failed to upload product reference /root/test.txt to S3 at
usr/src/oodt/data/archive/test/test.txt
    at org.apache.oodt.cas.filemgr.system.XmlRpcFileManager.ingestProduct(XmlRpcFileManager.java:768)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.xmlrpc.Invoker.execute(Invoker.java:130)
    at org.apache.xmlrpc.XmlRpcWorker.invokeHandler(XmlRpcWorker.java:84)
    at org.apache.xmlrpc.XmlRpcWorker.execute(XmlRpcWorker.java:146)
    at org.apache.xmlrpc.XmlRpcServer.execute(XmlRpcServer.java:139)
    at org.apache.xmlrpc.XmlRpcServer.execute(XmlRpcServer.java:125)
    at org.apache.xmlrpc.WebServer$Connection.run(WebServer.java:761)
    at org.apache.xmlrpc.WebServer$Runner.run(WebServer.java:642)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.oodt.cas.filemgr.structs.exceptions.DataTransferException:
Failed to upload product reference /root/test.txt to S3 at
usr/src/oodt/data/archive/test/test.txt
    at org.apache.oodt.cas.filemgr.datatransfer.S3DataTransferer.transferProduct(S3DataTransferer.java:78)
    at org.apache.oodt.cas.filemgr.system.XmlRpcFileManager.ingestProduct(XmlRpcFileManager.java:752)
    ... 12 more
Caused by: com.amazonaws.AmazonClientException: Unable to calculate a
request signature: Unable to calculate a request signature: Algorithm
HmacSHA1 not available
    at com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:71)
    at com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:57)
    at com.amazonaws.services.s3.internal.S3Signer.sign(S3Signer.java:128)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:330)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
    at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1393)
    at org.apache.oodt.cas.filemgr.datatransfer.S3DataTransferer.transferProduct(S3DataTransferer.java:76)
    ... 13 more
Caused by: com.amazonaws.AmazonClientException: Unable to calculate a
request signature: Algorithm HmacSHA1 not available
    at com.amazonaws.auth.AbstractAWSSigner.sign(AbstractAWSSigner.java:90)
    at com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:68)
    ... 20 more
Caused by: java.security.NoSuchAlgorithmException: Algorithm HmacSHA1 not
available
    at javax.crypto.Mac.getInstance(Mac.java:176)
    at com.amazonaws.auth.AbstractAWSSigner.sign(AbstractAWSSigner.java:86)
    ... 21 more

org.apache.xmlrpc.XmlRpcException: java.lang.Exception:
org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException: Error
ingesting product [org.apache.oodt.cas.filemgr.structs.Product@6454bbe1]
: org.apache.oodt.cas.filemgr.structs.exceptions.DataTransferException:
Failed to upload product reference /root/test.txt to S3 at
usr/src/oodt/data/archive/test/test.txt
    at org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeException(XmlRpcClientResponseProcessor.java:104)
    at org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeResponse(XmlRpcClientResponseProcessor.java:71)
    at org.apache.xmlrpc.XmlRpcClientWorker.execute(XmlRpcClientWorker.java:73)
    at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:194)
    at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:185)
    at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:178)
    at org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient.ingestProduct(XmlRpcFileManagerClient.java:1198)
    at org.apache.oodt.cas.filemgr.cli.action.IngestProductCliAction.execute(IngestProductCliAction.java:112)
    at org.apache.oodt.cas.cli.CmdLineUtility.execute(CmdLineUtility.java:331)
    at org.apache.oodt.cas.cli.CmdLineUtility.run(CmdLineUtility.java:187)
    at org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient.
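[Editor's note: the thread above also involves swapping in the JCE unlimited-strength policy jars (local_policy / us_export_policy) and possibly clobbered extension-jar paths. A quick sanity check is sketched below; the class name `JcePolicyCheck` is hypothetical. It prints the extension directories the JRE actually searches and the maximum allowed AES key length: 128 suggests the default export-restricted policy is still in effect, while Integer.MAX_VALUE suggests the unlimited policy jars took hold. Note that `java.ext.dirs` only exists on Java 8 and earlier.]

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

public class JcePolicyCheck {
    /** Max AES key length permitted by the installed JCE policy, or -1 if AES is unknown. */
    static int maxAesKeyLength() {
        try {
            return Cipher.getMaxAllowedKeyLength("AES");
        } catch (NoSuchAlgorithmException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        // Where the JRE looks for "extension" jars (null on Java 9+);
        // overriding this path can hide jce.jar and the policy files.
        System.out.println("java.ext.dirs   = " + System.getProperty("java.ext.dirs"));
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        System.out.println("Max AES key length: " + maxAesKeyLength());
    }
}
```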
Re: Data processing pipeline workflow management
Yes. Our batch processing back-end for the Resource Manager can now take advantage of the Mesos cluster manager. This enables OODT to farm batch processing out to a Mesos cluster.

-Michael

On Thu, Mar 12, 2015 at 7:58 AM, BW wrote:

> Any thoughts on integrating a plug-in service with Marathon first, then layering Mesos on top?
>
> On Wednesday, March 11, 2015, Mattmann, Chris A (3980) <chris.a.mattm...@jpl.nasa.gov> wrote:
>
> > Apache OODT now has a workflow plugin that connects to Mesos:
> >
> > http://oodt.apache.org/
> >
> > Cross posting this to dev@oodt.apache.org so people like Mike Starch can chime in.
> >
> > ++
> > Chris Mattmann, Ph.D.
> > Chief Architect
> > Instrument Software and Science Data Systems Section (398)
> > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > Office: 168-519, Mailstop: 168-527
> > Email: chris.a.mattm...@nasa.gov
> > WWW: http://sunset.usc.edu/~mattmann/
> > ++
> > Adjunct Associate Professor, Computer Science Department
> > University of Southern California, Los Angeles, CA 90089 USA
> > ++
> >
> > -Original Message-
> > From: Zameer Manji
> > Reply-To: "d...@aurora.incubator.apache.org"
> > Date: Wednesday, March 11, 2015 at 3:21 PM
> > To: "d...@aurora.incubator.apache.org" <d...@aurora.incubator.apache.org>
> > Subject: Re: Data processing pipeline workflow management
> >
> > >Hey,
> > >
> > >This is a great question. See my comments inline below.
> > >
> > >On Tue, Mar 10, 2015 at 8:28 AM, Lars Albertsson wrote:
> > >
> > >> We are evaluating Aurora as a workflow management tool for batch processing pipelines. We basically need a tool that regularly runs batch processes that are connected as producers/consumers of data, typically stored in HDFS or S3.
> > >>
> > >> The alternative tools would be Azkaban, Luigi, and Oozie, but I am hoping that building something on Aurora would result in a better solution.
> > >>
> > >> Does anyone have experience with building workflows with Aurora? How is Twitter handling batch pipelines? Would the approach below make sense, or are there better suggestions? Is there anything related to this in the roadmap or available inside Twitter only?
> > >
> > >As far as I know, you are the first person to consider Aurora for workflow management for batch processing. Currently Twitter does not use Aurora for batch pipelines.
> > >I'm not aware of the specifics of the design, but at Twitter there is an internal solution for pipelines built upon Hadoop/YARN.
> > >Currently Aurora is designed around being a service scheduler, and I'm not aware of any future plans to support workflows or batch computation.
> > >
> > >> In our case, the batch processes will be a mix of cluster computations with Spark and single-node computations. We want the latter to also be scheduled on a farm, and this is why we are attracted to Mesos. In the text below, I'll call each part of a pipeline a 'step', in order to avoid confusion with Aurora jobs and tasks.
> > >>
> > >> My unordered wishlist is:
> > >> * Data pipelines consist of DAGs, where steps take one or more inputs and generate one or more outputs.
> > >> * Independent steps in the DAG execute in parallel, constrained by resources.
> > >> * Steps can be written in different languages and frameworks, some clustered.
> > >> * The developer code/test/debug cycle is quick, and all functional tests can execute on a laptop.
> > >> * Developers can test integrated data pipelines, consisting of multiple steps, on laptops.
> > >> * Steps and their inputs and outputs are parameterised, e.g. by date. A parameterised step is typically independent from other instances of the same step, e.g. join one day's impressions log with user demographics. In some cases, steps depend on yesterday's results, e.g. apply one day's user management operation log to the user dataset from the day before.
> > >> * Data pipelines are specified in embedded DSL files (e.g. aurora files), kept close to the business logic code.
> > >> * Batch steps should be started soon after the input files become available.
> > >> * Steps should gracefully avoid recomputation when output files exist.
> > >> * Backfilling a window back in time, e.g. 30 days, should happen automatically if some earlier steps have failed, or if output files have been deleted manually.
> > >> * Continuous deployment in the sense that steps are automatically deployed and scheduled after 'git push'.
Re: Data processing pipeline workflow management
Any thoughts on integrating a plug-in service with Marathon first, then layering Mesos on top?

On Wednesday, March 11, 2015, Mattmann, Chris A (3980) <chris.a.mattm...@jpl.nasa.gov> wrote:

> Apache OODT now has a workflow plugin that connects to Mesos:
>
> http://oodt.apache.org/
>
> Cross posting this to dev@oodt.apache.org so people like Mike Starch can chime in.
>
> ++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattm...@nasa.gov
> WWW: http://sunset.usc.edu/~mattmann/
> ++
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++
>
> -Original Message-
> From: Zameer Manji
> Reply-To: "d...@aurora.incubator.apache.org"
> Date: Wednesday, March 11, 2015 at 3:21 PM
> To: "d...@aurora.incubator.apache.org" <d...@aurora.incubator.apache.org>
> Subject: Re: Data processing pipeline workflow management
>
> >Hey,
> >
> >This is a great question. See my comments inline below.
> >
> >On Tue, Mar 10, 2015 at 8:28 AM, Lars Albertsson wrote:
> >
> >> We are evaluating Aurora as a workflow management tool for batch processing pipelines. We basically need a tool that regularly runs batch processes that are connected as producers/consumers of data, typically stored in HDFS or S3.
> >>
> >> The alternative tools would be Azkaban, Luigi, and Oozie, but I am hoping that building something on Aurora would result in a better solution.
> >>
> >> Does anyone have experience with building workflows with Aurora? How is Twitter handling batch pipelines? Would the approach below make sense, or are there better suggestions? Is there anything related to this in the roadmap or available inside Twitter only?
> >
> >As far as I know, you are the first person to consider Aurora for workflow management for batch processing. Currently Twitter does not use Aurora for batch pipelines.
> >I'm not aware of the specifics of the design, but at Twitter there is an internal solution for pipelines built upon Hadoop/YARN.
> >Currently Aurora is designed around being a service scheduler, and I'm not aware of any future plans to support workflows or batch computation.
> >
> >> In our case, the batch processes will be a mix of cluster computations with Spark and single-node computations. We want the latter to also be scheduled on a farm, and this is why we are attracted to Mesos. In the text below, I'll call each part of a pipeline a 'step', in order to avoid confusion with Aurora jobs and tasks.
> >>
> >> My unordered wishlist is:
> >> * Data pipelines consist of DAGs, where steps take one or more inputs and generate one or more outputs.
> >> * Independent steps in the DAG execute in parallel, constrained by resources.
> >> * Steps can be written in different languages and frameworks, some clustered.
> >> * The developer code/test/debug cycle is quick, and all functional tests can execute on a laptop.
> >> * Developers can test integrated data pipelines, consisting of multiple steps, on laptops.
> >> * Steps and their inputs and outputs are parameterised, e.g. by date. A parameterised step is typically independent from other instances of the same step, e.g. join one day's impressions log with user demographics. In some cases, steps depend on yesterday's results, e.g. apply one day's user management operation log to the user dataset from the day before.
> >> * Data pipelines are specified in embedded DSL files (e.g. aurora files), kept close to the business logic code.
> >> * Batch steps should be started soon after the input files become available.
> >> * Steps should gracefully avoid recomputation when output files exist.
> >> * Backfilling a window back in time, e.g. 30 days, should happen automatically if some earlier steps have failed, or if output files have been deleted manually.
> >> * Continuous deployment in the sense that steps are automatically deployed and scheduled after 'git push'.
> >> * Step owners can get an overview of step status and history, and debug step execution, e.g. by accessing log files.
> >>
> >> I am aware that no framework will give us everything. It is a matter of how much we need to live without or build ourselves.
> >
> >Your wishlist looks pretty reasonable for batch computation workflows.
> >
> >I'm not aware of any batch/workflow Mesos framework. If you want some or all of the above features on top of Mes
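[Editor's note: as a rough illustration of one wishlist item above ("steps should gracefully avoid recomputation when output files exist"), a date-parameterised step runner can be sketched as follows. The class `Step`, the `outputFor` helper, and the `.done` marker convention are all hypothetical illustrations, not part of Aurora, Mesos, or OODT.]

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.LocalDate;

public class Step {
    /** Output partition for a given step name and date, e.g. impressions-join/2015-03-10.done */
    static Path outputFor(String name, LocalDate date) {
        return Paths.get(name, date.toString() + ".done");
    }

    /** Runs the step body only if its output does not already exist; returns true if it ran. */
    static boolean runIfNeeded(Path output, Runnable body) {
        if (Files.exists(output)) {
            return false; // gracefully skip: output already produced
        }
        body.run();
        return true;
    }

    public static void main(String[] args) {
        Path out = outputFor("impressions-join", LocalDate.of(2015, 3, 10));
        boolean ran = runIfNeeded(out, () -> System.out.println("computing " + out));
        System.out.println(ran ? "step ran" : "skipped, output exists");
    }
}
```

A real backfill would loop such a check over a window of dates and submit only the missing partitions to the scheduler.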