Re: multioutput dfs.datanode.max.xcievers and too many open files
Hi Marc,

Take a look at "How to Increase Open File Limit" (http://www.hypertable.com/documentation/misc/how_to_increase_open_file_limit/) for instructions on how to increase the file limit.

- Doug

On Thu, Feb 23, 2012 at 7:26 AM, Marc Sturlese marc.sturl...@gmail.com wrote:

Hey there,

I've been running a cluster of about 20 machines for about a year. I've run many concurrent jobs on it, some of them with MultipleOutputs, and never had any problem (those jobs were creating just 3 or 4 different outputs). Now I have a job whose MultipleOutputs creates 100 different outputs, and it always ends up failing. Tasks start throwing errors like this:

    java.io.IOException: Bad connect ack with firstBadLink 10.2.0.154:50010
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2963)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2888)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1900(DFSClient.java:2139)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2329)

or:

    java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:250)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
        at org.apache.hadoop.io.Text.readString(Text.java:400)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2961)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2888)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1900(DFSClient.java:2139)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2329)

Checking the datanode log, I see this error hundreds of times:

    2012-02-23 14:22:56,008 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Reopen already-open Block for append blk_336844604470452_29464903
    2012-02-23 14:22:56,008 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_336844604470452_29464903 received exception java.net.SocketException: Too many open files
    2012-02-23 14:22:56,008 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.2.0.156:50010, storageID=DS-1194175480-10.2.0.156-50010-1329304363220, infoPort=50075, ipcPort=50020):DataXceiver
    java.net.SocketException: Too many open files
        at sun.nio.ch.Net.socket0(Native Method)
        at sun.nio.ch.Net.socket(Net.java:97)
        at sun.nio.ch.SocketChannelImpl.init(SocketChannelImpl.java:84)
        at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:37)
        at java.nio.channels.SocketChannel.open(SocketChannel.java:105)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.newSocket(DataNode.java:429)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:296)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:118)
    2012-02-23 14:22:56,034 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-2698946892792040969_29464904 src: /10.2.0.156:40969 dest: /10.2.0.156:50010
    2012-02-23 14:22:56,035 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-2698946892792040969_29464904 received exception java.net.SocketException: Too many open files
    2012-02-23 14:22:56,035 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.2.0.156:50010, storageID=DS-1194175480-10.2.0.156-50010-1329304363220, infoPort=50075, ipcPort=50020):DataXceiver
    java.net.SocketException: Too many open files
        at sun.nio.ch.Net.socket0(Native Method)
        ... (identical stack trace as above)

I've always had this configured in hdfs-site.xml:

    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>4096</value>
    </property>

But I think that's no longer enough to handle this many MultipleOutputs. If I increase max.xcievers even more, what are the side effects? What value should be considered the maximum? (I suppose it depends on the CPU and RAM, but approximately.)

Thanks in advance.
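Since the datanode error here is "Too many open files" rather than an xceiver-count rejection, it's worth checking the OS descriptor limit alongside dfs.datanode.max.xcievers. A quick shell sketch (the pgrep pattern and the `hdfs` user name are assumptions about a typical install, not taken from this thread):

```shell
# Per-process descriptor limit inherited by processes started from this shell:
ulimit -n

# Count descriptors currently held by a running DataNode
# (the pgrep pattern is an assumption; adjust to your process name):
#   ls /proc/$(pgrep -f DataNode | head -1)/fd | wc -l

# To raise the limit persistently, add lines like these to
# /etc/security/limits.conf (user name and values are illustrative),
# then restart the datanode from a fresh login:
#   hdfs  soft  nofile  32768
#   hdfs  hard  nofile  32768
```

If the fd count sits near the soft limit while the job runs, raising nofile (not just max.xcievers) is the fix.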
Re: Addendum to Hypertable vs. HBase Performance Test (w/ mslab enabled)
Hi Edward,

In the 1/2 trillion record use case that I referred to, data is streaming in from a realtime feed and needs to be online immediately, so bulk loading is not an option. In the 41 billion and 167 billion record insert tests, HBase consistently failed. We tried everything we could think of, but nothing helped. We even reduced the number of load processes on each client machine from 12 down to 1, so that the data was just trickling in, and HBase still failed with *Concurrent Mode Failure*. It appeared that eventually too much maintenance activity built up in the region servers, causing them to fail.

- Doug

On Fri, Feb 17, 2012 at 5:58 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

As your numbers show:

    Dataset Size | Hypertable Queries/s | HBase Queries/s | Hypertable Latency (ms) | HBase Latency (ms)
    0.5 TB       | 3256.42              | 2969.52         | 157.221                 | 172.351
    5 TB         | 2450.01              | 2066.52         | 208.972                 | 247.680

Raw data goes up. Read performance goes down. Latency goes up. You mentioned you loaded 1/2 trillion records of historical financial data. The operative word is historical. You're not doing 1/2 trillion writes every day. Most of the systems that use structured log formats can write very fast (I am guessing that is what Hypertable uses, btw). dd writes very fast as well, but if you want acceptable read latency you are going to need a good RAM/disk ratio. Even at 0.5 TB, 157.221 ms is not a great read latency, so your ability to write fast has already outstripped your ability to read at a rate that could support, say, a web application. (I come from a world of 1-5 ms latency, BTW.) What application can you support with numbers like that? An email compliance system where you want to store a ton of data, but only plan on doing one search a day to make an auditor happy? :) This is why I say you're going to end up needing about the same number of nodes, because when it comes time to read this data, having a machine with 4 TB of data and 24 GB of RAM is not going to cut it.

You are right on a couple of fronts: 1) being able to load data fast is good (can't argue with that); 2) if HBase can't load X entries, that is bad. I really can't imagine that HBase blows up and just stops accepting inserts at some point. You seem to say it's happening, and I don't have time to verify. But if you are at the point where you are getting 175 ms random and 85 ms Zipfian latency, what are you proving? That is already more data than a server can handle.

http://en.wikipedia.org/wiki/Network_performance
"Users browsing the Internet feel that responses are instant when delays are less than 100 ms from click to response[11]. Latency and throughput together affect the perceived speed of a connection. However, the perceived performance of a connection can still vary widely, depending in part on the type of information transmitted and how it is used."

On Fri, Feb 17, 2012 at 7:25 PM, Doug Judd d...@hypertable.com wrote:

Hi Edward,

The problem is that even if the workload is 5% write and 95% read, if you can't load the data, you need more machines. In the 167 billion insert test, HBase failed with *Concurrent mode failure* after 20% of the data was loaded. One of our customers has loaded 1/2 trillion records of historical financial market data on 16 machines. If you do the back-of-the-envelope calculation, it would take about 180 machines for HBase to load 1/2 trillion cells. That makes HBase 10X more expensive in terms of hardware, power consumption, and data center real estate.

- Doug

On Fri, Feb 17, 2012 at 3:58 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

I would almost agree with that perspective. But there is a problem with the 'Java is slow' theory. The reason is that in a 100 percent write workload, GC might be a factor. But in the real world, people have to read data, and reads become disk bound as your data gets larger than memory. Unless C++ can make your disk spin faster than Java, it is a wash. Making a claim that you're going to need more servers for Java/HBase is bogus.

To put it in perspective: if the workload is 5% write and 95% read, you are probably going to need just the same amount of hardware. You might get some win on the read side because your custom caching could be more efficient in terms of object size in memory and other GC issues, but it is not 2 or 3 to one. If a million writes fall into a Hypertable forest but it takes a billion years to read them back, did the writes ever sync? :)

On Monday, February 13, 2012, Doug Judd d...@hypertable.com wrote:

Hey Todd,

Bulk loading isn't always an option when data is streaming in from a live application. Many big data use cases involve massive amounts of smaller items in the size range of 10-100 bytes, for example URLs, sensor readings, genome sequence reads, network traffic logs, etc. If HBase requires 2-3 times
Re: Addendum to Hypertable vs. HBase Performance Test (w/ mslab enabled)
Hi Edward,

The problem is that even if the workload is 5% write and 95% read, if you can't load the data, you need more machines. In the 167 billion insert test, HBase failed with *Concurrent mode failure* after 20% of the data was loaded. One of our customers has loaded 1/2 trillion records of historical financial market data on 16 machines. If you do the back-of-the-envelope calculation, it would take about 180 machines for HBase to load 1/2 trillion cells. That makes HBase 10X more expensive in terms of hardware, power consumption, and data center real estate.

- Doug

On Fri, Feb 17, 2012 at 3:58 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

I would almost agree with that perspective. But there is a problem with the 'Java is slow' theory. The reason is that in a 100 percent write workload, GC might be a factor. But in the real world, people have to read data, and reads become disk bound as your data gets larger than memory. Unless C++ can make your disk spin faster than Java, it is a wash. Making a claim that you're going to need more servers for Java/HBase is bogus.

To put it in perspective: if the workload is 5% write and 95% read, you are probably going to need just the same amount of hardware. You might get some win on the read side because your custom caching could be more efficient in terms of object size in memory and other GC issues, but it is not 2 or 3 to one. If a million writes fall into a Hypertable forest but it takes a billion years to read them back, did the writes ever sync? :)

On Monday, February 13, 2012, Doug Judd d...@hypertable.com wrote:

Hey Todd,

Bulk loading isn't always an option when data is streaming in from a live application. Many big data use cases involve massive amounts of smaller items in the size range of 10-100 bytes, for example URLs, sensor readings, genome sequence reads, network traffic logs, etc.

If HBase requires 2-3 times the amount of hardware to avoid *Concurrent mode failures*, then that makes HBase 2-3 times more expensive from the standpoint of hardware, power consumption, and datacenter real estate. What takes the most time is getting the core database mechanics right (we're going on 5 years now). Once the core database is stable, integrations with applications such as Solr and others are short-term projects. I believe that sooner or later, most engineers working in this space will come to the conclusion that Java is the wrong language for this kind of database application. At that point, folks on the HBase project will realize that they are five years behind.

- Doug

On Mon, Feb 13, 2012 at 11:33 AM, Todd Lipcon t...@cloudera.com wrote:

Hey Doug,

Want to also run a comparison test with inter-cluster replication turned on? How about Kerberos-based security on secure HDFS? How about ACLs or other table permissions, even without strong authentication? Can you run a test comparing performance running on top of Hadoop 0.23? How about running other ecosystem products like Solbase, Havrobase, and Lily, or commercial products like Digital Reasoning's Synthesys, etc?

For those unfamiliar, the answer to all of the above is that those comparisons can't be run, because Hypertable is years behind HBase in terms of features, adoption, etc. They've found a set of benchmarks they win at, but bulk loading either database through the put API is the wrong way to go about it anyway. Anyone loading 5T of data like this would use the bulk load APIs, which are one to two orders of magnitude more efficient. Just ask the Yahoo crawl cache team, who has ~1PB stored in HBase, or Facebook, or eBay, or many others who store hundreds to thousands of TBs in HBase today.

Thanks,
-Todd

On Mon, Feb 13, 2012 at 9:07 AM, Doug Judd d...@hypertable.com wrote:

In our original test, we mistakenly ran the HBase test with the hbase.hregion.memstore.mslab.enabled property set to false. We re-ran the test with the hbase.hregion.memstore.mslab.enabled property set to true and have reported the results in the following addendum:

Addendum to Hypertable vs. HBase Performance Test
http://www.hypertable.com/why_hypertable/hypertable_vs_hbase_2/addendum/

Synopsis: It slowed performance on the 10KB and 1KB tests and still failed the 100 byte and 10 byte tests with *Concurrent mode failure*.

- Doug

--
Todd Lipcon
Software Engineer, Cloudera
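The back-of-the-envelope calculation above can be made explicit. This sketch uses only the figures quoted in the thread (1/2 trillion records, 16 Hypertable machines, ~180 estimated HBase machines); the per-node numbers are derived from those figures, not measured:

```python
records = 500_000_000_000          # 1/2 trillion records of market data
hypertable_machines = 16           # the customer's actual cluster size
hbase_machines_estimated = 180     # the back-of-the-envelope figure from the thread

per_node_hypertable = records / hypertable_machines    # ~31 billion records/node
per_node_hbase = records / hbase_machines_estimated    # ~2.8 billion records/node

# Hardware ratio implied by the two cluster sizes:
ratio = hbase_machines_estimated / hypertable_machines
print(f"Hypertable: {per_node_hypertable:.3e} records/node")
print(f"HBase:      {per_node_hbase:.3e} records/node")
print(f"implied hardware ratio: {ratio:.2f}x")  # 11.25x, roughly the "10X" claimed
```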
Addendum to Hypertable vs. HBase Performance Test (w/ mslab enabled)
In our original test, we mistakenly ran the HBase test with the hbase.hregion.memstore.mslab.enabled property set to false. We re-ran the test with the hbase.hregion.memstore.mslab.enabled property set to true and have reported the results in the following addendum:

Addendum to Hypertable vs. HBase Performance Test
http://www.hypertable.com/why_hypertable/hypertable_vs_hbase_2/addendum/

Synopsis: It slowed performance on the 10KB and 1KB tests and still failed the 100 byte and 10 byte tests with *Concurrent mode failure*.

- Doug
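For readers who want to reproduce the corrected run: the MSLAB property is set in hbase-site.xml. A minimal fragment (the property name is as given in the post; the surrounding layout is the standard Hadoop-style configuration format):

```xml
<configuration>
  <!-- Enable MemStore-Local Allocation Buffers, which allocate memstore data
       in fixed-size chunks to reduce old-generation heap fragmentation. -->
  <property>
    <name>hbase.hregion.memstore.mslab.enabled</name>
    <value>true</value>
  </property>
</configuration>
```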
Re: Addendum to Hypertable vs. HBase Performance Test (w/ mslab enabled)
Hey Todd,

Bulk loading isn't always an option when data is streaming in from a live application. Many big data use cases involve massive amounts of smaller items in the size range of 10-100 bytes, for example URLs, sensor readings, genome sequence reads, network traffic logs, etc. If HBase requires 2-3 times the amount of hardware to avoid *Concurrent mode failures*, then that makes HBase 2-3 times more expensive from the standpoint of hardware, power consumption, and datacenter real estate.

What takes the most time is getting the core database mechanics right (we're going on 5 years now). Once the core database is stable, integrations with applications such as Solr and others are short-term projects. I believe that sooner or later, most engineers working in this space will come to the conclusion that Java is the wrong language for this kind of database application. At that point, folks on the HBase project will realize that they are five years behind.

- Doug

On Mon, Feb 13, 2012 at 11:33 AM, Todd Lipcon t...@cloudera.com wrote:

Hey Doug,

Want to also run a comparison test with inter-cluster replication turned on? How about Kerberos-based security on secure HDFS? How about ACLs or other table permissions, even without strong authentication? Can you run a test comparing performance running on top of Hadoop 0.23? How about running other ecosystem products like Solbase, Havrobase, and Lily, or commercial products like Digital Reasoning's Synthesys, etc?

For those unfamiliar, the answer to all of the above is that those comparisons can't be run, because Hypertable is years behind HBase in terms of features, adoption, etc. They've found a set of benchmarks they win at, but bulk loading either database through the put API is the wrong way to go about it anyway. Anyone loading 5T of data like this would use the bulk load APIs, which are one to two orders of magnitude more efficient. Just ask the Yahoo crawl cache team, who has ~1PB stored in HBase, or Facebook, or eBay, or many others who store hundreds to thousands of TBs in HBase today.

Thanks,
-Todd

On Mon, Feb 13, 2012 at 9:07 AM, Doug Judd d...@hypertable.com wrote:

In our original test, we mistakenly ran the HBase test with the hbase.hregion.memstore.mslab.enabled property set to false. We re-ran the test with the hbase.hregion.memstore.mslab.enabled property set to true and have reported the results in the following addendum:

Addendum to Hypertable vs. HBase Performance Test
http://www.hypertable.com/why_hypertable/hypertable_vs_hbase_2/addendum/

Synopsis: It slowed performance on the 10KB and 1KB tests and still failed the 100 byte and 10 byte tests with *Concurrent mode failure*.

- Doug

--
Todd Lipcon
Software Engineer, Cloudera
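For context on the *Concurrent mode failure* mentioned throughout this thread: it is the CMS collector losing the race against allocation and falling back to a stop-the-world full GC. In HBase deployments of this era the usual knobs lived in conf/hbase-env.sh; a sketch of the commonly used flags (the specific values are illustrative starting points, not recommendations from this thread):

```shell
# In conf/hbase-env.sh -- use CMS and start concurrent collection earlier,
# so it is less likely to fall behind allocation and trigger a
# concurrent mode failure (a stop-the-world full GC):
export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC \
  -XX:CMSInitiatingOccupancyFraction=70 \
  -XX:+UseCMSInitiatingOccupancyOnly"
```

These flags reduce the likelihood of the failure mode but do not eliminate heap fragmentation, which is what the MSLAB option discussed below the quote is aimed at.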
Hypertable vs. HBase Performance Test
Hello,

We recently conducted a test comparing the performance of Hypertable with that of HBase. The test is summarized in the following High Scalability post:

Hypertable Routs HBase in Performance Test -- HBase Overwhelmed by Garbage Collection
http://highscalability.com/blog/2012/2/7/hypertable-routs-hbase-in-performance-test-hbase-overwhelmed.html

We've also unveiled a new website, www.hypertable.com. If you haven't checked out Hypertable in a while, now would be a great time.

- Doug
Re: Hypertable vs. HBase Performance Test
I'm happy to post it there as well. I do think it is of interest to Hadoop users in general, especially ones that are not using HBase due to performance problems.

- Doug

On Tue, Feb 7, 2012 at 9:45 AM, Harsh J ha...@cloudera.com wrote:

Hi Doug,

Perhaps this should go to the HBase users@ list instead of Hadoop common-user@?

On Tue, Feb 7, 2012 at 11:09 PM, Doug Judd nuggetwh...@gmail.com wrote:

Hello,

We recently conducted a test comparing the performance of Hypertable with that of HBase. The test is summarized in the following High Scalability post:

Hypertable Routs HBase in Performance Test -- HBase Overwhelmed by Garbage Collection
http://highscalability.com/blog/2012/2/7/hypertable-routs-hbase-in-performance-test-hbase-overwhelmed.html

We've also unveiled a new website, www.hypertable.com. If you haven't checked out Hypertable in a while, now would be a great time.

- Doug

--
Harsh J
Customer Ops. Engineer
Cloudera | http://tiny.cloudera.com/about
Hypertable vs. HBase Performance Evaluation
We recently did a performance evaluation comparing Hypertable with HBase. We thought users on this list might be interested in the results. The following link points to a blog post which summarizes the results and has a pointer to the full test report:

http://blog.hypertable.com/?p=14

- Doug
Hypertable binary packages available
Hypertable (www.hypertable.org) is an open source C++ implementation of Bigtable which runs on top of HDFS. Binary packages (RPM, Debian, dmg) for Hypertable are now available and can be downloaded here:

http://package.hypertable.org/

Updated documentation, with a Getting Started guide, can be found here:

http://www.hypertable.org/documentation.html

- Doug Judd