Re: Hadoop API performance testcases
Hi Amir,

Did you have any specific APIs in mind? There are, obviously, sort, shuffle, gridmix, DFSIO, etc., which measure different aspects of Hadoop job execution.

Ravi

On Tue, Jul 10, 2012 at 12:58 PM, Amir Sanjar wrote:
>
> Hi all, are there any performance testcases for hadoop APIs? We are looking
> for testcases to time performance of each API.
>
> Best Regards
> Amir Sanjar
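The benchmarks Ravi names measure whole-job behavior; since the question is about timing individual APIs, a per-call harness is easy to sketch. This is plain Java with hypothetical names (`ApiTimer` is not a Hadoop class); in practice the `Callable` would wrap a FileSystem call such as `fs.mkdirs(path)` or `fs.getFileStatus(path)`.

```java
import java.util.concurrent.Callable;

// Minimal, hypothetical timing harness for measuring a single API call.
// The benchmarks named in the thread (sort, gridmix, DFSIO) time whole
// jobs; this times one call in isolation.
public class ApiTimer {
    /** Runs the call and returns its wall-clock duration in nanoseconds. */
    public static <T> long time(Callable<T> call) throws Exception {
        long start = System.nanoTime();
        call.call();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in operation; replace with the Hadoop API call under test.
        long nanos = time(() -> Integer.parseInt("42"));
        System.out.println("call took " + nanos + " ns");
    }
}
```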
Re: Jetty fixes for Hadoop
One question I did have on this: if anyone is now seeing more Jetty issues on the datanode, possibly with increased usage via webhdfs, do we need something similar to https://issues.apache.org/jira/browse/MAPREDUCE-3184 on the datanode side?

Tom

On 7/10/12 5:19 PM, "Todd Lipcon" wrote:
> +1 from me too. We've had this in CDH since Sep '11 and been working
> much better than the stock 6.1.26.
>
> -Todd
>
> On Tue, Jul 10, 2012 at 3:14 PM, Owen O'Malley wrote:
>> On Tue, Jul 10, 2012 at 2:59 PM, Thomas Graves wrote:
>>
>>> I'm +1 for adding it.
>>>
>>
>> I'm +1 also.
>>
>> -- Owen
[jira] [Created] (HADOOP-8588) SerializationFactory shouldn't throw a NullPointerException if the serializations list is empty
Harsh J created HADOOP-8588:
-------------------------------

             Summary: SerializationFactory shouldn't throw a NullPointerException if the serializations list is empty
                 Key: HADOOP-8588
                 URL: https://issues.apache.org/jira/browse/HADOOP-8588
             Project: Hadoop Common
          Issue Type: Improvement
          Components: io
    Affects Versions: 2.0.0-alpha
            Reporter: Harsh J
            Priority: Minor

The SerializationFactory throws an NPE if CommonConfigurationKeys.IO_SERIALIZATIONS_KEY is set to an empty list in the config. It should rather print a WARN log indicating the serializations list is empty, and start up without any valid serialization classes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
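A hedged sketch of the behavior the issue asks for — this is a stand-in class, not Hadoop's actual SerializationFactory: on an empty or missing list, log a WARN and come up with no serializations instead of hitting an NPE, so lookups simply return null.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Stand-in sketch (not org.apache.hadoop.io.serializer.SerializationFactory):
// tolerate an empty serializations list by warning and continuing.
public class SerializationFactorySketch {
    private static final Logger LOG =
        Logger.getLogger(SerializationFactorySketch.class.getName());
    private final List<String> serializations = new ArrayList<>();

    public SerializationFactorySketch(String[] configuredClasses) {
        if (configuredClasses == null || configuredClasses.length == 0) {
            LOG.warning("Serialization list is empty; no serialization classes loaded.");
            return; // start up with an empty list instead of throwing an NPE
        }
        for (String cls : configuredClasses) {
            serializations.add(cls);
        }
    }

    /** Returns the first configured serialization, or null if none exist. */
    public String getSerialization() {
        return serializations.isEmpty() ? null : serializations.get(0);
    }
}
```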
Re: Jetty fixes for Hadoop
I am +1 on this also, although I think we need to look at moving to Jetty-7, or possibly dropping Jetty completely and looking at Netty or even Tomcat long term. Jetty has just been way too unstable at Hadoop scale, and that has not really changed with newer versions of Jetty. Sticking with an old, forked, unsupported version of Jetty long term seems very risky to me.

--Bobby

On 7/10/12 5:19 PM, "Todd Lipcon" wrote:
>+1 from me too. We've had this in CDH since Sep '11 and been working
>much better than the stock 6.1.26.
>
>-Todd
>
>On Tue, Jul 10, 2012 at 3:14 PM, Owen O'Malley wrote:
>> On Tue, Jul 10, 2012 at 2:59 PM, Thomas Graves wrote:
>>
>>> I'm +1 for adding it.
>>>
>>
>> I'm +1 also.
>>
>> -- Owen
>
>--
>Todd Lipcon
>Software Engineer, Cloudera
Re: Jetty fixes for Hadoop
I agree that we should explore alternatives to the forked version of Jetty. This is a longer-term goal. In the interim, let's do jetty-6.1.26 + fixes.

On Wed, Jul 11, 2012 at 6:03 AM, Robert Evans wrote:
> I am +1 on this also, although I think we need to look at moving to
> Jetty-7 or possibly dropping Jetty completely and look at Netty or even
> Tomcat long term. Jetty has just been way too unstable at Hadoop scale
> and that has not really changed with newer versions of Jetty. Sticking
> with an old forked unsupported version of Jetty long term seems very risky
> to me.
>
> --Bobby
>
> On 7/10/12 5:19 PM, "Todd Lipcon" wrote:
>
> >+1 from me too. We've had this in CDH since Sep '11 and been working
> >much better than the stock 6.1.26.
> >
> >-Todd
> >
> >On Tue, Jul 10, 2012 at 3:14 PM, Owen O'Malley wrote:
> >> On Tue, Jul 10, 2012 at 2:59 PM, Thomas Graves wrote:
> >>
> >>> I'm +1 for adding it.
> >>>
> >>
> >> I'm +1 also.
> >>
> >> -- Owen
> >
> >--
> >Todd Lipcon
> >Software Engineer, Cloudera
Re: Jetty fixes for Hadoop
In 2.0 we've reimplemented the shuffle using Netty for a significant speed-up. Moving webhdfs to Netty would be interesting.

-- Owen

On Jul 11, 2012, at 6:03, Robert Evans wrote:
> I am +1 on this also, although I think we need to look at moving to
> Jetty-7 or possibly dropping Jetty completely and look at Netty or even
> Tomcat long term. Jetty has just been way too unstable at Hadoop scale
> and that has not really changed with newer versions of Jetty. Sticking
> with an old forked unsupported version of Jetty long term seems very risky
> to me.
>
> --Bobby
>
> On 7/10/12 5:19 PM, "Todd Lipcon" wrote:
>
>> +1 from me too. We've had this in CDH since Sep '11 and been working
>> much better than the stock 6.1.26.
>>
>> -Todd
>>
>> On Tue, Jul 10, 2012 at 3:14 PM, Owen O'Malley wrote:
>>> On Tue, Jul 10, 2012 at 2:59 PM, Thomas Graves wrote:
>>>
>>> I'm +1 for adding it.
>>>
>>> I'm +1 also.
>>>
>>> -- Owen
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
Re: Problem setting up 1st generation Hadoop-0.20 (ANT build) in Eclipse
I resolved the issue of "Exception in thread "main" java.lang.InternalError: Can't connect to X11 window server using ':0' as the value of the DISPLAY variable."

The problem occurs when "ant tar" is issued as root while the X display socket is owned by a different user. To avoid it, execute "xhost +local:all" in a terminal belonging to the user who owns the X display socket, then issue ant as root. It works fine. This does have some security implications, but it is the easiest workaround.

On Tue, Jul 10, 2012 at 5:50 PM, Pavan Kulkarni wrote:
> I tried "ant tar" but it requested a forrest home directory, so I ran
> "ant -Dforrest.home=/path/apache-forrest-0.8 compile-core tar"
> but this gets stuck at an Exception
> "Exception in thread "main" java.lang.InternalError: Can't connect to X11
> window server using ':0' as the value of the DISPLAY variable."
> No idea what this exception means. How come there isn't good
> documentation or a BUILDING.txt file for MR1 releases? Any help regarding
> this is appreciated. Thanks
>
> On Tue, Jul 10, 2012 at 4:29 PM, Harsh J wrote:
>
>> Hey Pavan,
>>
>> Try an "ant tar". For more ant targets, read the build.xml at the root
>> of your checkout.
>>
>> On Wed, Jul 11, 2012 at 1:15 AM, Pavan Kulkarni wrote:
>> > Thanks a lot Harsh. I could set it up without any errors.
>> > It would be great if you could provide me any pointers on how to build a
>> > binary distribution tar file.
>> > The information on wiki and in BUILDING.txt only has Maven
>> > instructions. Thanks
>> >
>> > On Tue, Jul 10, 2012 at 2:39 PM, Harsh J wrote:
>> >
>> >> Hey Pavan,
>> >>
>> >> The 0.20.x version series was renamed recently to 1.x. Hence, you need
>> >> to use the branch-1 code path if you want the latest stable branch
>> >> (MR1, etc.) code.
>> >>
>> >> Do these once you have ant 1.8 and a Sun/Oracle JDK 1.6 installed, and
>> >> you should have it:
>> >>
>> >> $ git clone http://github.com/apache/hadoop-common.git hadoop-1
>> >> $ cd hadoop-1; git checkout branch-1
>> >> $ ant eclipse
>> >>
>> >> (Now export this directory into Eclipse as a Java project)
>> >>
>> >> HTH.
>> >>
>> >> On Wed, Jul 11, 2012 at 12:00 AM, Pavan Kulkarni wrote:
>> >> > Hi all,
>> >> >
>> >> > I am trying to set up hadoop 1st generation 0.20 in Eclipse, which still
>> >> > uses Ant as its build tool.
>> >> > The build was successful, but when I want to set it up in the Eclipse IDE,
>> >> > i.e.
>> >> > File -> New Project -> Project from existing ANT build file -> Select
>> >> > build.xml -> Finish
>> >> > I get the following error:
>> >> > "Problem setting the classpath of the project from the javac classpath:
>> >> > Reference ivy-common.classpath not found."
>> >> >
>> >> > I have tried finding solutions online but couldn't get a concrete one. Are
>> >> > there any sources or workarounds on setting up 1st generation
>> >> > Hadoop in Eclipse?
>> >> >
>> >> > Also, my second question was how to build a binary tar file for hadoop-0.20,
>> >> > which still uses ANT. The wiki pages only have information for maven.
>> >> > Any help is highly appreciated. Thanks
>> >> > --
>> >> > --With Regards
>> >> > Pavan Kulkarni
>> >>
>> >> --
>> >> Harsh J
>> >
>> > --
>> > --With Regards
>> > Pavan Kulkarni
>>
>> --
>> Harsh J
>
> --
> --With Regards
> Pavan Kulkarni

--
--With Regards
Pavan Kulkarni
[jira] [Resolved] (HADOOP-8565) AuthenticationFilter#doFilter warns unconditionally when using SPNEGO
[ https://issues.apache.org/jira/browse/HADOOP-8565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur resolved HADOOP-8565.
----------------------------------------
    Resolution: Duplicate

This is a dup of HADOOP-8355 (which I'd missed backporting to branch-1; doing that now)

> AuthenticationFilter#doFilter warns unconditionally when using SPNEGO
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-8565
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8565
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 1.0.3
>            Reporter: Eli Collins
>            Assignee: Alejandro Abdelnur
>
> The following code in AuthenticationFilter#doFilter throws
> AuthenticationException (and warns) unconditionally because
> KerberosAuthenticator#authenticate returns null if SPNEGO is used.
> {code}
> token = authHandler.authenticate(httpRequest, httpResponse);
> ...
> if (token != null) { ... } else {
>   throw new AuthenticationException
> }
> {code}
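The shape of the guarded check can be sketched in isolation — these are stand-in types, not the real servlet or Hadoop classes, and the "committed response" condition is an assumption drawn from how SPNEGO negotiation works: the handler may return a null token while it has already written the 401 challenge itself, so a null token is only an error when nothing has been sent back yet.

```java
// Hedged sketch of a null-token check that tolerates in-progress SPNEGO
// negotiation (stand-in types, not the actual AuthenticationFilter).
public class AuthFilterSketch {
    static class Response {
        boolean committed; // true once the SPNEGO challenge has been written
    }

    /** Returns an error message, or null if the request state is fine. */
    public static String check(Object token, Response response) {
        if (token == null && !response.committed) {
            return "Authentication failed"; // genuinely unauthenticated
        }
        return null; // either authenticated, or mid-SPNEGO negotiation
    }
}
```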
[jira] [Resolved] (HADOOP-8587) HarFileSystem access of harMetaCache isn't threadsafe
[ https://issues.apache.org/jira/browse/HADOOP-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins resolved HADOOP-8587.
---------------------------------
          Resolution: Fixed
       Fix Version/s: 2.0.1-alpha
                      1.2.0
    Target Version/s:   (was: 1.2.0, 2.0.1-alpha)
        Hadoop Flags: Reviewed

Thanks for the review Daryn. I've committed this to trunk and merged to branch-2 and branch-1.

> HarFileSystem access of harMetaCache isn't threadsafe
> -----------------------------------------------------
>
>                 Key: HADOOP-8587
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8587
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 1.0.0, 2.0.0-alpha
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>            Priority: Minor
>             Fix For: 1.2.0, 2.0.1-alpha
>
>         Attachments: hadoop-8587-b1.txt, hadoop-8587.txt, hadoop-8587.txt
>
>
> HarFileSystem's use of the static harMetaCache map is not threadsafe. Credit
> to Todd for pointing this out.
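One common way to fix an unsafe shared static cache — a sketch only, not necessarily the patch committed here — is to swap the plain map for a `ConcurrentHashMap` and make the check-then-insert atomic:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hedged sketch of a thread-safe static cache (stand-in for the
// harMetaCache pattern; not HarFileSystem's actual fix).
public class MetaCacheSketch {
    private static final Map<String, String> CACHE = new ConcurrentHashMap<>();

    public static String getOrLoad(String archivePath) {
        // computeIfAbsent performs the check-then-insert atomically,
        // avoiding the lost-update race a plain HashMap would have.
        return CACHE.computeIfAbsent(archivePath, p -> "metadata for " + p);
    }
}
```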
a bug in ViewFs tests
Hi,

I noticed that the fix done in HADOOP-8036 (failing ViewFs tests) was reverted later when resolving HADOOP-8129, so the bug exists both in 0.23 and 2.0. I'm going to provide an alternative fix. Should I reopen HADOOP-8036 or create a new one instead? Thanks.

--
Andrey Klochkov
[jira] [Resolved] (HADOOP-8365) Add flag to disable durable sync
[ https://issues.apache.org/jira/browse/HADOOP-8365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas resolved HADOOP-8365.
-------------------------------------
      Resolution: Fixed
    Release Note: This patch enables durable sync by default. Installations that used to run without setting {{dfs.support.append}}, or with it set to false in the configuration (i.e. where HBase was not used), must set {{dfs.durable.sync}} to false to preserve the previous semantics.
    Hadoop Flags: Incompatible change,Reviewed  (was: Reviewed)

> Add flag to disable durable sync
> --------------------------------
>
>                 Key: HADOOP-8365
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8365
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 1.1.0
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>            Priority: Blocker
>             Fix For: 1.1.0
>
>         Attachments: hadoop-8365.txt, hadoop-8365.txt
>
>
> Per HADOOP-8230 there's a request for a flag to disable the sync code paths
> that dfs.support.append used to enable. The sync method itself will still be
> available and have a broken implementation as that was the behavior before
> HADOOP-8230. This config flag should default to false as the primary
> motivation for HADOOP-8230 is so HBase works out-of-the-box with Hadoop 1.1.
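Based on the release note above, an installation opting back out of durable sync would set the flag in its configuration. A minimal fragment — the property name is taken from the note; its placement in hdfs-site.xml is an assumption:

```xml
<!-- hdfs-site.xml (sketch): preserve the pre-HADOOP-8230 semantics -->
<property>
  <name>dfs.durable.sync</name>
  <value>false</value>
</property>
```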
Re: a bug in ViewFs tests
Hey Andrey. I'd open a new jira, and thanks for the find btw!

Thanks,
Eli

On Wed, Jul 11, 2012 at 4:08 PM, Andrey Klochkov wrote:
> Hi,
> I noticed that the fix done in HADOOP-8036 (failing ViewFs tests) was
> reverted later when resolving HADOOP-8129, so the bug exists both in
> 0.23 and 2.0. I'm going to provide an alternative fix. Should I reopen
> HADOOP-8036 or create a new one instead? Thanks.
>
> --
> Andrey Klochkov
[jira] [Created] (HADOOP-8589) ViewFs tests fail when tests dir is under Jenkins home dir
Andrey Klochkov created HADOOP-8589:
---------------------------------------

             Summary: ViewFs tests fail when tests dir is under Jenkins home dir
                 Key: HADOOP-8589
                 URL: https://issues.apache.org/jira/browse/HADOOP-8589
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs, test
    Affects Versions: 2.0.0-alpha, 0.23.1
            Reporter: Andrey Klochkov

TestFSMainOperationsLocalFileSystem fails in case when the test root directory is under the user's home directory, and the user's home dir is deeper than 2 levels from /. This happens with the default 1-node installation of Jenkins.

This is the failure log:

{code}
org.apache.hadoop.fs.FileAlreadyExistsException: Path /var already exists as dir; cannot create link here
	at org.apache.hadoop.fs.viewfs.InodeTree.createLink(InodeTree.java:244)
	at org.apache.hadoop.fs.viewfs.InodeTree.<init>(InodeTree.java:334)
	at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.<init>(ViewFileSystem.java:167)
	at org.apache.hadoop.fs.viewfs.ViewFileSystem.initialize(ViewFileSystem.java:167)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2094)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:79)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2128)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2110)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:290)
	at org.apache.hadoop.fs.viewfs.ViewFileSystemTestSetup.setupForViewFileSystem(ViewFileSystemTestSetup.java:76)
	at org.apache.hadoop.fs.viewfs.TestFSMainOperationsLocalFileSystem.setUp(TestFSMainOperationsLocalFileSystem.java:40)
	...

Standard Output

2012-07-11 22:07:20,239 INFO  mortbay.log (Slf4jLog.java:info(67)) - Home dir base /var/lib
{code}

The reason for the failure is that the code tries to mount links for both "/var" and "/var/lib", and it fails for the 2nd one as "/var" is mounted already. The fix was provided in HADOOP-8036 but later it was reverted in HADOOP-8129.
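The conflict described above can be illustrated with a toy mount table — a hedged sketch, not the real InodeTree: once "/var" is mounted as a link, a later attempt to mount "/var/lib" would have to create a child under a link node, which is rejected.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the mount-link conflict (stand-in for InodeTree.createLink):
// a link may not be created beneath a path that is already a link.
public class MountTreeSketch {
    private final Map<String, Boolean> links = new HashMap<>(); // path -> isLink

    public void createLink(String path) {
        // Reject if any ancestor of this path is already mounted as a link.
        String parent = path;
        int slash;
        while ((slash = parent.lastIndexOf('/')) > 0) {
            parent = parent.substring(0, slash);
            if (links.containsKey(parent)) {
                throw new IllegalStateException(
                    "Path " + parent + " already exists as dir; cannot create link here");
            }
        }
        links.put(path, true);
    }
}
```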
Re: plz help! HDFS to S3 copy issues
This is a hadoop-user question, not a development one - please use the right list, as user questions get ignored on the dev lists.

also: http://wiki.apache.org/hadoop/ConnectionRefused

On 11 July 2012 19:23, Momina Khan wrote:
> i use the following command to try to copy data from hdfs to my s3 bucket
>
> ubuntu@domU-12-31-39-04-6E-58:/state/partition1/hadoop-1.0.1$ bin/hadoop distcp
>   hdfs://10.240.113.162:9001/data/ s3://ID:SECRET@momina
>
> java throws a connection refused exception ... i am running just this one
> instance the same URI works fine for other hdfs commands ... even localhost
> gives the same error
> plz help i have also tried hftp but i am guessing connection refused is not
> a distcp error ... have tried all i can ... i have the authentication
> certificate private key in place ... could it be an authentication failure
> ... but the connection refused is mentioned on the hdfs URI.
>
> plz anyone help ... have tried google extensively!
>
> Find the call trace attached below:
>
> ubuntu@domU-12-31-39-04-6E-58:/state/partition1/hadoop-1.0.1$ bin/hadoop distcp
>   hdfs://10.240.113.162:9001/data/ s3://ID:SECRET@momina
>
> 12/07/05 12:48:37 INFO tools.DistCp: srcPaths=[hdfs://10.240.113.162:9001/data]
> 12/07/05 12:48:37 INFO tools.DistCp: destPath=s3://ID:SECRET@momina
> 12/07/05 12:48:38 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 0 time(s).
> 12/07/05 12:48:39 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 1 time(s).
> 12/07/05 12:48:40 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 2 time(s).
> 12/07/05 12:48:41 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 3 time(s).
> 12/07/05 12:48:42 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 4 time(s).
> 12/07/05 12:48:43 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 5 time(s).
> 12/07/05 12:48:44 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 6 time(s).
> 12/07/05 12:48:45 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 7 time(s).
> 12/07/05 12:48:46 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 8 time(s).
> 12/07/05 12:48:47 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 9 time(s).
> With failures, global counters are inaccurate; consider running with -i
> Copy failed: java.net.ConnectException: Call to domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001 failed on connection exception: java.net.ConnectException: Connection refused
>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>         at $Proxy1.getProtocolVersion(Unknown Source)
>         at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>         at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>         at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>         at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>         at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>         at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>         at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>         at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
>         at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:635)
>         at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
>         at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>         at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
> Caused by: java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>         at org.apache.hadoop.ipc.Client$Connection.set
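As the ConnectionRefused wiki page linked above suggests, the first thing to verify is that something is actually listening on the NameNode host:port the hdfs:// URI points at. A hedged diagnostic sketch in plain Java (the host and port are the ones from the trace; `PortProbe` is a hypothetical helper, not a Hadoop tool):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Minimal TCP reachability probe: distinguishes "nothing is listening"
// (the Connection refused above) from auth or distcp problems.
public class PortProbe {
    /** Returns true if a TCP connection to host:port succeeds. */
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false; // refused, timed out, or unreachable
        }
    }

    public static void main(String[] args) {
        System.out.println(canConnect("10.240.113.162", 9001, 2000)
            ? "something is listening" : "connection refused/unreachable");
    }
}
```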
[jira] [Created] (HADOOP-8590) Backport HADOOP-7318 (MD5Hash factory should reset the digester it returns) to branch-1
Todd Lipcon created HADOOP-8590:
-----------------------------------

             Summary: Backport HADOOP-7318 (MD5Hash factory should reset the digester it returns) to branch-1
                 Key: HADOOP-8590
                 URL: https://issues.apache.org/jira/browse/HADOOP-8590
             Project: Hadoop Common
          Issue Type: Bug
          Components: io
    Affects Versions: 1.0.3
            Reporter: Todd Lipcon

I ran into this bug on branch-1 today; it seems like we should backport it.
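The pattern behind the referenced fix can be sketched as follows — a hedged stand-in, not Hadoop's MD5Hash: when a factory hands out a cached or thread-local `MessageDigest`, it must call `reset()` first, or state left by a previous (possibly aborted) hash corrupts the next result.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hedged sketch of a reset-on-hand-out digester factory (stand-in for the
// MD5Hash.getDigester pattern described in HADOOP-7318).
public class DigesterFactorySketch {
    private static final ThreadLocal<MessageDigest> DIGESTER =
        ThreadLocal.withInitial(() -> {
            try {
                return MessageDigest.getInstance("MD5");
            } catch (NoSuchAlgorithmException e) {
                throw new RuntimeException(e); // MD5 is always present
            }
        });

    public static MessageDigest getDigester() {
        MessageDigest d = DIGESTER.get();
        d.reset(); // discard any state left by a previous caller
        return d;
    }
}
```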