java.net.SocketTimeoutException: read(2) error: Resource temporarily unavailable

2014-07-06 Thread lei liu
I use hbase-0.94 and hadoop-2.2, and I get the exception below:

2014-07-04 12:43:49,700 WARN org.apache.hadoop.hdfs.DFSClient: failed to
connect to
DomainSocket(fd=322,path=/home/hadoop/hadoop-current/cdh4-dn-socket/dn_socket)

java.net.SocketTimeoutException: read(2) error: Resource temporarily
unavailable

at org.apache.hadoop.net.unix.DomainSocket.readArray0(Native Method)

at
org.apache.hadoop.net.unix.DomainSocket.access$200(DomainSocket.java:47)

at
org.apache.hadoop.net.unix.DomainSocket$DomainInputStream.read(DomainSocket.java:530)

at java.io.FilterInputStream.read(FilterInputStream.java:66)

at
org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:169)

at
org.apache.hadoop.hdfs.BlockReaderFactory.newShortCircuitBlockReader(BlockReaderFactory.java:187)

at
org.apache.hadoop.hdfs.BlockReaderFactory.newBlockReader(BlockReaderFactory.java:104)

at
org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:1060)

at
org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange(DFSInputStream.java:898)

at
org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1148)

at
org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:73)

at
org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1388)

at
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1880)

at
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1723)

at
org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:365)

at
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.readNextDataBlock(HFileReaderV2.java:633)

at
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:730)

at
org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:128)



Why does the exception java.net.SocketTimeoutException: read(2)
error: Resource temporarily unavailable appear?


 Thanks,
LiuLei


hdfs cache

2014-04-21 Thread lei liu
I use hadoop-2.4, and I want to use the HDFS cache function.

I use the ulimit -l  32212254720 Linux command to set the max locked
memory size, but I get the error below:
 ulimit  -l  322
-bash: ulimit: max locked memory: cannot modify limit: Operation not
permitted

How can I set the max locked memory size?

Thanks,

LiuLei
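
For reference, the memlock limit usually has to be raised by root in
/etc/security/limits.conf rather than with ulimit from a normal shell (the
ulimit -l value is in KB). A minimal sketch, assuming the datanode runs as the
hadoop user (values illustrative):

    # /etc/security/limits.conf
    hadoop  soft  memlock  unlimited
    hadoop  hard  memlock  unlimited

After logging in again with the new limit in place, the HDFS cache size itself
is set in hdfs-site.xml with dfs.datanode.max.locked.memory (in bytes), which
must not exceed the memlock limit.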


heterogeneous storages in HDFS

2014-04-14 Thread lei liu
Hadoop-2.4 was released on April 11, but it does not include the
heterogeneous storages function. When will hadoop include this function?

Thanks,

LiuLei


Re: heterogeneous storages in HDFS

2014-04-14 Thread lei liu
When will that version of hadoop be released?




2014-04-14 17:04 GMT+08:00 Stanley Shi s...@gopivotal.com:

 Please find it in this page:  https://wiki.apache.org/hadoop/Roadmap

 hadoop 2.3.0 only include phase 1 of the heterogeneous storage; phase
 2 will be included in 2.5.0;

 Regards,
 *Stanley Shi,*



 On Mon, Apr 14, 2014 at 4:38 PM, ascot.m...@gmail.com 
 ascot.m...@gmail.com wrote:

 hi,

 From 2.3.0
 20 February, 2014: Release 2.3.0 available

 Apache Hadoop 2.3.0 contains a number of significant enhancements such
 as:

- Support for Heterogeneous Storage hierarchy in HDFS.


 Is it already there?

 Ascot


 On 14 Apr, 2014, at 4:34 pm, lei liu liulei...@gmail.com wrote:

 On April 11 hadoop-2.4 is released, the hadoop-2.4 does not include
 heterogeneous storages function, when does hadoop include the function?

 Thanks,

 LiuLei






download hadoop-2.4

2014-04-10 Thread lei liu
Hadoop-2.4 is released; where can I download the hadoop-2.4 code from?


Thanks,

LiuLei


HDFS Client write data is slow

2014-02-24 Thread lei liu
I use Hbase-0.94 and hadoop-2.0.

I installed one HDFS cluster that has 15 datanodes. If the network bandwidth of
two datanodes is saturated (for example 100m/s), the writing performance of the
entire hdfs cluster is slow.

I think the slow datanodes affect the writing performance of the entire
cluster.

How does the HDFS client avoid writing data to the slow datanodes?

Thanks,

LiuLei


datanode is slow

2014-02-20 Thread lei liu
I use Hbase0.94 and CDH4. There are 25729 TCP connections on one
machine, for example:
hadoop@apayhbs081 ~ $ netstat -a | wc -l
25729

The linux configuration is:
   soft core     0
   hard rss      1
   hard nproc    20
   soft nproc    20
   hard nproc    50
   hard nproc    0
   maxlogins     4
   nproc         20480
   nofile        204800


When there are 25729 TCP connections on one machine, the datanode is very
slow.
How can I resolve this problem?


umount bad disk

2014-02-13 Thread lei liu
I use HBase0.96 and CDH4.3.1.

I use Short-Circuit Local Read:

<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <value>/home/hadoop/cdh4-dn-socket/dn_socket</value>
</property>

When one disk goes bad, the RegionServer still has some files open on that
disk, so I cannot run umount, for example:
sudo umount -f /disk10
umount2: Device or resource busy
umount: /disk10: device is busy
umount2: Device or resource busy
umount: /disk10: device is busy

I must stop the RegionServer in order to run the umount command.


How can I remove the bad disk without stopping the RegionServer?

Thanks,

LiuLei


hadoop security

2013-11-18 Thread lei liu
When I use hadoop security, I must use jsvc to start the datanode. Why must
jsvc be used to start the datanode? What are the advantages of doing that?

Thanks,

LiuLei


hadoop security

2013-11-11 Thread lei liu
There is a DelegationToken in hadoop2. What is the role of the DelegationToken,
and how is the DelegationToken used?

Thanks,

LiuLei
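
For reference, a minimal sketch of how a client typically obtains a delegation
token after a Kerberos login, so that later processes (for example MapReduce
tasks) can authenticate to the NameNode without Kerberos credentials; the
renewer name and class name here are illustrative:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.security.token.Token;

    public class DelegationTokenExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Ask the NameNode for a delegation token; "mapred" (illustrative) is
        // the principal that will be allowed to renew it.
        Token<?> token = fs.getDelegationToken("mapred");
        System.out.println("kind=" + token.getKind()
            + " service=" + token.getService());
        // A job would normally add the token to its credentials so that its
        // tasks can use it instead of Kerberos tickets.
      }
    }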


Decommission DataNode

2013-10-22 Thread lei liu
In CDH3u5, when the DataNode is decommissioned, the DataNode process will
be shut down by the NameNode.

But in CDH4.3.1, when the DataNode is decommissioned, the DataNode process
will not be shut down by the NameNode.


When the datanode is decommissioned, why is the datanode not automatically
shut down by the NameNode in CDH4.3.1?


Thanks,

LiuLei


ClientDatanodeProtocol.recoverBlock

2013-10-18 Thread lei liu
In CDH3u3 there is a ClientDatanodeProtocol.recoverBlock method, which is
used to recover a block when data streaming fails.


But in CDH4.3.1 the recoverBlock method no longer exists in
ClientDatanodeProtocol, so when data streaming fails, the block is not
recovered. Will that lead to a bug?


Thanks,

LiuLei


./bin/hdfs haadmin -transitionToActive deadlock

2013-10-12 Thread lei liu
I use CDH4.3.1. When I start the NameNodes and transition one NameNode to
active, there is the deadlock below:

Found one Java-level deadlock:
=
22558696@qtp-1616586953-6:
  waiting to lock monitor 0x2aaab3621f40 (object 0xf7646958, a
org.apache.hadoop.hdfs.server.namenode.NameNode),
  which is held by IPC Server handler 1 on 20020
IPC Server handler 1 on 20020:
  waiting to lock monitor 0x2aaab9052ab8 (object 0xf747f1b8, a
org.apache.hadoop.metrics2.impl.MetricsSystemImpl),
  which is held by Timer for 'NameNode' metrics system
Timer for 'NameNode' metrics system:
  waiting for ownable synchronizer 0xf764a858, (a
java.util.concurrent.locks.ReentrantReadWriteLock$FairSync),
  which is held by IPC Server handler 1 on 20020


TestHDFSCLI error

2013-10-10 Thread lei liu
I use CDH4.3.1 and run the TestHDFSCLI unit test, but there are the errors below:

2013-10-10 13:05:39,671 INFO  cli.CLITestHelper
(CLITestHelper.java:displayResults(156)) -
---
2013-10-10 13:05:39,671 INFO  cli.CLITestHelper
(CLITestHelper.java:displayResults(157)) - Test ID: [1]
2013-10-10 13:05:39,671 INFO  cli.CLITestHelper
(CLITestHelper.java:displayResults(158)) -Test Description:
[ls: file using absolute path]
2013-10-10 13:05:39,671 INFO  cli.CLITestHelper
(CLITestHelper.java:displayResults(159)) -
2013-10-10 13:05:39,671 INFO  cli.CLITestHelper
(CLITestHelper.java:displayResults(163)) -   Test Commands:
[-fs hdfs://localhost.localdomain:41053 -touchz /file1]
2013-10-10 13:05:39,672 INFO  cli.CLITestHelper
(CLITestHelper.java:displayResults(163)) -   Test Commands:
[-fs hdfs://localhost.localdomain:41053 -ls /file1]
2013-10-10 13:05:39,672 INFO  cli.CLITestHelper
(CLITestHelper.java:displayResults(167)) -
2013-10-10 13:05:39,672 INFO  cli.CLITestHelper
(CLITestHelper.java:displayResults(170)) -Cleanup Commands:
[-fs hdfs://localhost.localdomain:41053 -rm /file1]
2013-10-10 13:05:39,672 INFO  cli.CLITestHelper
(CLITestHelper.java:displayResults(174)) -
2013-10-10 13:05:39,672 INFO  cli.CLITestHelper
(CLITestHelper.java:displayResults(178)) -  Comparator:
[TokenComparator]
2013-10-10 13:05:39,672 INFO  cli.CLITestHelper
(CLITestHelper.java:displayResults(180)) -  Comparision result:
[pass]
2013-10-10 13:05:39,672 INFO  cli.CLITestHelper
(CLITestHelper.java:displayResults(182)) - Expected output:
[Found 1 items]
2013-10-10 13:05:39,672 INFO  cli.CLITestHelper
(CLITestHelper.java:displayResults(184)) -   Actual output:
[Found 1 items
-rw-r--r--   1 musa.ll supergroup  0 2013-10-10 13:04 /file1
]
2013-10-10 13:05:39,673 INFO  cli.CLITestHelper
(CLITestHelper.java:displayResults(178)) -  Comparator:
[RegexpComparator]
2013-10-10 13:05:39,673 INFO  cli.CLITestHelper
(CLITestHelper.java:displayResults(180)) -  Comparision result:
[fail]
2013-10-10 13:05:39,673 INFO  cli.CLITestHelper
(CLITestHelper.java:displayResults(182)) - Expected output:
[^-rw-r--r--( )*1( )*[a-z]*( )*supergroup( )*0(
)*[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( )*/file1]
2013-10-10 13:05:39,673 INFO  cli.CLITestHelper
(CLITestHelper.java:displayResults(184)) -   Actual output:
[Found 1 items
-rw-r--r--   1 musa.ll supergroup  0 2013-10-10 13:04 /file1
]


How can I handle the error?


Thanks,

LiuLei


NullPointerException when start datanode

2013-09-30 Thread lei liu
I use CDH-4.3.1. When I start the datanode, there is the error below:

2013-09-26 17:57:07,803 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at
0.0.0.0:40075
2013-09-26 17:57:07,814 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: dfs.webhdfs.enabled = false
2013-09-26 17:57:07,814 INFO org.apache.hadoop.http.HttpServer: Jetty bound
to port 40075
2013-09-26 17:57:07,814 INFO org.mortbay.log: jetty-6.1.26.cloudera.2
2013-09-26 17:57:08,129 INFO org.mortbay.log: Started
SelectChannelConnector@0.0.0.0:40075
2013-09-26 17:57:08,643 INFO org.apache.hadoop.ipc.Server: Starting Socket
Reader #1 for port 40020
2013-09-26 17:57:08,698 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Opened IPC server at /
0.0.0.0:40020
2013-09-26 17:57:08,710 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Refresh request received
for nameservices: haosong-hadoop
2013-09-26 17:57:08,748 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Starting BPOfferServices
for nameservices: haosong-hadoop
2013-09-26 17:57:08,784 WARN org.apache.hadoop.hdfs.server.common.Util:
Path /home/haosong.hhs/develop/soft/hadoop-2.0.0-cdh4.3.1/data should be
specified as a URI in configuration files. Please update hdfs configuration.
2013-09-26 17:57:08,785 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool registering
(storage id unknown) service to /10.232.98.30:7000 starting to offer service
2013-09-26 17:57:08,785 WARN org.apache.hadoop.hdfs.server.common.Util:
Path /home/haosong.hhs/develop/soft/hadoop-2.0.0-cdh4.3.1/data should be
specified as a URI in configuration files. Please update hdfs configuration.
2013-09-26 17:57:08,786 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool registering
(storage id unknown) service to /10.232.98.33:7000 starting to offer service
2013-09-26 17:57:08,896 INFO org.apache.hadoop.ipc.Server: IPC Server
Responder: starting
2013-09-26 17:57:08,898 INFO org.apache.hadoop.ipc.Server: IPC Server
listener on 40020: starting
2013-09-26 17:57:09,239 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
BP-78625276-10.232.98.30-1376034343336:blk_-2874307426466435275_16431304
src: /10.232.98.33:42654 dest: /10.232.98.33:40010
2013-09-26 17:57:09,239 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
BP-78625276-10.232.98.30-1376034343336:blk_5332252262254683952_16431307
src: /10.232.98.30:47301 dest: /10.232.98.33:40010
2013-09-26 17:57:09,239 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
BP-78625276-10.232.98.30-1376034343336:blk_-7540820026406349432_16431305
src: /10.232.98.30:47300 dest: /10.232.98.33:40010
2013-09-26 17:57:09,239 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
BP-78625276-10.232.98.30-1376034343336:blk_-5489298128750533734_16431306
src: /10.232.98.33:42655 dest: /10.232.98.33:40010
2013-09-26 17:57:09,247 INFO org.apache.hadoop.hdfs.server.common.Storage:
Lock on /disk6/haosong-cdh4/data/in_use.lock acquired by nodename
24...@dw33.kgb.sqa.cm4
2013-09-26 17:57:09,271 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode:
dw33.kgb.sqa.cm4:40010:DataXceiver error processing WRITE_BLOCK operation
src: /10.232.98.33:42655 dest: /10.232.98.33:40010
java.lang.NullPointerException
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:159)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:452)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:103)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:67)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:222)
at java.lang.Thread.run(Thread.java:662)
2013-09-26 17:57:09,271 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode:
dw33.kgb.sqa.cm4:40010:DataXceiver error processing WRITE_BLOCK operation
src: /10.232.98.33:42654 dest: /10.232.98.33:40010
java.lang.NullPointerException
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:159)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:452)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:103)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:67)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:222)
at java.lang.Thread.run(Thread.java:662)
2013-09-26 17:57:09,271 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode:
dw33.kgb.sqa.cm4:40010:DataXceiver error processing WRITE_BLOCK operation
src: /10.232.98.30:47300 dest: /10.232.98.33:40010
java.lang.NullPointerException
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:159)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:452)
at

IncompatibleClassChangeError

2013-09-29 Thread lei liu
I use CDH-4.3.1 and MR1; when I run one job, I am getting the following
error.
Exception in thread main java.lang.IncompatibleClassChangeError:
Found interface org.apache.hadoop.mapreduce.JobContext, but class was
expected
at 
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:152)
at 
org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1063)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1080)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:992)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at 
org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:566)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596)
at 
com.taobao.hbase.test.RandomKVGenerater.main(RandomKVGenerater.java:248)


How can I handle the error?

Thanks,

LiuLei


Re: IncompatibleClassChangeError

2013-09-29 Thread lei liu
Yes, my job is compiled against CDH3u3, and I run the job on CDH4.3.1, but I
use the MR1 of CDH4.3.1 to run the job.

What are the differences between the MR1 of CDH4 and the MR of CDH3?

Thanks,

LiuLei


2013/9/30 Pradeep Gollakota pradeep...@gmail.com

 I believe it's a difference between the version that your code was
 compiled against vs the version that you're running against. Make sure that
 you're not packaging hadoop jar's into your jar and make sure you're
 compiling against the correct version as well.


 On Sun, Sep 29, 2013 at 7:27 PM, lei liu liulei...@gmail.com wrote:

 I use the CDH-4.3.1 and mr1, when I run one job, I am getting the
 following error.

 Exception in thread main java.lang.IncompatibleClassChangeError: Found 
 interface org.apache.hadoop.mapreduce.JobContext, but class was expected

 at 
 org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:152)

 at 
 org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1063)

 at 
 org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1080)

 at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
 at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:992)
 at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)

 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)

 at 
 org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945)
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:566)
 at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596)

 at 
 com.taobao.hbase.test.RandomKVGenerater.main(RandomKVGenerater.java:248)


 How can I handle the error?

 Thanks,

 LiuLei
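
In other words, the usual fix is to compile against the same major version you
run on and to keep the Hadoop jars out of the job jar. A hedged Maven sketch
(the version string is illustrative):

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.0.0-cdh4.3.1</version>
      <scope>provided</scope>
    </dependency>

With provided scope the cluster's own jars are used at runtime, so mapreduce
classes compiled into the job cannot shadow the CDH4 MR1 classes on the
cluster.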





IncompatibleClassChangeError

2013-09-26 Thread lei liu
I use CDH-4.3.1 and MR1; when I run one job, I am getting the following
error.
Exception in thread main java.lang.IncompatibleClassChangeError:
Found interface org.apache.hadoop.mapreduce.JobContext, but class was
expected
at 
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:152)
at 
org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1063)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1080)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:992)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at 
org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:566)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596)
at 
com.taobao.hbase.test.RandomKVGenerater.main(RandomKVGenerater.java:248)


Re: metric type

2013-09-06 Thread lei liu
Hello,

Can anybody answer the question?


2013/9/1 lei liu liulei...@gmail.com

 Hi Jitendra, thanks for your reply.

 If MutableCounterLong is used for IO/sec statistics, I think the value of
 MutableCounterLong should be divided by 10 and be reset to zero every ten
 seconds in the MutableCounterLong.snapshot method, is that right? But the
 MutableCounterLong.snapshot method doesn't do that. If I missed anything,
 please tell me. Looking forward to your reply.

 Thanks,
 LiuLei


 2013/9/1 Jitendra Yadav jeetuyadav200...@gmail.com

 Yes, MutableCounterLong helps to gather DataNode read/write statics.
 There is more option available within this metric

 Regards
 Jitendra
 On 8/31/13, lei liu liulei...@gmail.com wrote:
  There is @Metric MutableCounterLong bytesWritten attribute in
  DataNodeMetrics, which is used to IO/sec statistics?
 
 
  2013/8/31 Jitendra Yadav jeetuyadav200...@gmail.com
 
  Hi,
 
  For IO/sec statistics I think MutableCounterLongRate  and
  MutableCounterLong more useful than others and for xceiver thread
  number I'm not bit sure right now.
 
  Thanks
  Jiitendra
  On Fri, Aug 30, 2013 at 1:40 PM, lei liu liulei...@gmail.com wrote:
  
   Hi  Jitendra,
   If I want to statistics number of bytes read per second,and display
 the
  result into ganglia, should I use MutableCounterLong or
 MutableGaugeLong?
  
   If I want to display current xceiver thread number in datanode into
  ganglia, should I use MutableCounterLong or MutableGaugeLong?
  
   Thanks,
   LiuLei
  
  
   2013/8/30 Jitendra Yadav jeetuyadav200...@gmail.com
  
   Hi,
  
   Below link contains the answer for your question.
  
  
 
 http://hadoop.apache.org/docs/r1.2.0/api/org/apache/hadoop/metrics2/package-summary.html
  
   Regards
   Jitendra
  
   On Fri, Aug 30, 2013 at 11:35 AM, lei liu liulei...@gmail.com
 wrote:
  
   I use the metrics v2, there are COUNTER and GAUGE metric type in
  metrics v2.
   What is the difference between the two?
  
   Thanks,
   LiuLei
  
  
  
 
 





Re: metric type

2013-09-01 Thread lei liu
Hi Jitendra, thanks for your reply.

If MutableCounterLong is used for IO/sec statistics, I think the value of
MutableCounterLong should be divided by 10 and be reset to zero every ten
seconds in the MutableCounterLong.snapshot method, is that right? But the
MutableCounterLong.snapshot method doesn't do that. If I missed anything,
please tell me. Looking forward to your reply.

Thanks,
LiuLei


2013/9/1 Jitendra Yadav jeetuyadav200...@gmail.com

 Yes, MutableCounterLong helps to gather DataNode read/write statics.
 There is more option available within this metric

 Regards
 Jitendra
 On 8/31/13, lei liu liulei...@gmail.com wrote:
  There is @Metric MutableCounterLong bytesWritten attribute in
  DataNodeMetrics, which is used to IO/sec statistics?
 
 
  2013/8/31 Jitendra Yadav jeetuyadav200...@gmail.com
 
  Hi,
 
  For IO/sec statistics I think MutableCounterLongRate  and
  MutableCounterLong more useful than others and for xceiver thread
  number I'm not bit sure right now.
 
  Thanks
  Jiitendra
  On Fri, Aug 30, 2013 at 1:40 PM, lei liu liulei...@gmail.com wrote:
  
   Hi  Jitendra,
   If I want to statistics number of bytes read per second,and display
 the
  result into ganglia, should I use MutableCounterLong or
 MutableGaugeLong?
  
   If I want to display current xceiver thread number in datanode into
  ganglia, should I use MutableCounterLong or MutableGaugeLong?
  
   Thanks,
   LiuLei
  
  
   2013/8/30 Jitendra Yadav jeetuyadav200...@gmail.com
  
   Hi,
  
   Below link contains the answer for your question.
  
  
 
 http://hadoop.apache.org/docs/r1.2.0/api/org/apache/hadoop/metrics2/package-summary.html
  
   Regards
   Jitendra
  
   On Fri, Aug 30, 2013 at 11:35 AM, lei liu liulei...@gmail.com
 wrote:
  
   I use the metrics v2, there are COUNTER and GAUGE metric type in
  metrics v2.
   What is the difference between the two?
  
   Thanks,
   LiuLei
  
  
  
 
 



Re: metric type

2013-08-31 Thread lei liu
There is a @Metric MutableCounterLong bytesWritten attribute in
DataNodeMetrics; is it used for IO/sec statistics?


2013/8/31 Jitendra Yadav jeetuyadav200...@gmail.com

 Hi,

 For IO/sec statistics I think MutableCounterLongRate  and
 MutableCounterLong more useful than others and for xceiver thread
 number I'm not bit sure right now.

 Thanks
 Jiitendra
 On Fri, Aug 30, 2013 at 1:40 PM, lei liu liulei...@gmail.com wrote:
 
  Hi  Jitendra,
  If I want to statistics number of bytes read per second,and display the
 result into ganglia, should I use MutableCounterLong or MutableGaugeLong?
 
  If I want to display current xceiver thread number in datanode into
 ganglia, should I use MutableCounterLong or MutableGaugeLong?
 
  Thanks,
  LiuLei
 
 
  2013/8/30 Jitendra Yadav jeetuyadav200...@gmail.com
 
  Hi,
 
  Below link contains the answer for your question.
 
 
 http://hadoop.apache.org/docs/r1.2.0/api/org/apache/hadoop/metrics2/package-summary.html
 
  Regards
  Jitendra
 
  On Fri, Aug 30, 2013 at 11:35 AM, lei liu liulei...@gmail.com wrote:
 
  I use the metrics v2, there are COUNTER and GAUGE metric type in
 metrics v2.
  What is the difference between the two?
 
  Thanks,
  LiuLei
 
 
 



namenode name dir

2013-08-30 Thread lei liu
I use QJM; do I need to configure two directories for dfs.namenode.name.dir,
one local filesystem path and one NFS path?

I think the Standby NameNode also stores the fsimage, so I think I only
need to configure one local filesystem path.


Thanks,

LiuLei
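
For reference, with QJM the shared edits live on the JournalNodes and each
NameNode keeps its own fsimage locally, so an NFS entry in
dfs.namenode.name.dir is not required. A hedged hdfs-site.xml sketch (paths and
addresses illustrative):

    <property>
      <name>dfs.namenode.name.dir</name>
      <value>file:///data/1/dfs/nn</value>
    </property>
    <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster</value>
    </property>

Listing a second local directory in dfs.namenode.name.dir is still a common
extra safety net against local disk failure.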


metric type

2013-08-30 Thread lei liu
I use metrics v2; there are COUNTER and GAUGE metric types in metrics
v2.
What is the difference between the two?

Thanks,
LiuLei
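
For reference, a minimal metrics2 sketch (class and metric names illustrative):
a COUNTER only increases and sinks derive rates from it, while a GAUGE is a
point-in-time value that can go up and down.

    import org.apache.hadoop.metrics2.annotation.Metric;
    import org.apache.hadoop.metrics2.annotation.Metrics;
    import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
    import org.apache.hadoop.metrics2.lib.MutableCounterLong;
    import org.apache.hadoop.metrics2.lib.MutableGaugeLong;

    @Metrics(name = "ExampleMetrics", context = "dfs")
    public class ExampleMetrics {
      // COUNTER: only goes up; a sink can turn the total into a rate
      @Metric("Total bytes read") MutableCounterLong bytesRead;
      // GAUGE: current value, can go up and down
      @Metric("Current xceiver threads") MutableGaugeLong xceiverCount;

      public static ExampleMetrics create() {
        return DefaultMetricsSystem.instance()
            .register("ExampleMetrics", "Example metrics source",
                      new ExampleMetrics());
      }

      public void addBytesRead(long n) { bytesRead.incr(n); }
      public void xceiverStarted()     { xceiverCount.incr(); }
      public void xceiverFinished()    { xceiverCount.decr(); }
    }

So a bytes-read total would typically be a counter (with Ganglia deriving the
per-second rate), while the current xceiver thread count fits a gauge.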


Re: metric type

2013-08-30 Thread lei liu
Hi Jitendra,
If I want to count the number of bytes read per second and display the
result in ganglia, should I use MutableCounterLong or MutableGaugeLong?

If I want to display the current xceiver thread count of a datanode in ganglia,
should I use MutableCounterLong or MutableGaugeLong?

Thanks,
LiuLei


2013/8/30 Jitendra Yadav jeetuyadav200...@gmail.com

 Hi,

 Below link contains the answer for your question.


 http://hadoop.apache.org/docs/r1.2.0/api/org/apache/hadoop/metrics2/package-summary.html

 Regards
 Jitendra

 On Fri, Aug 30, 2013 at 11:35 AM, lei liu liulei...@gmail.com wrote:

 I use the metrics v2, there are COUNTER and GAUGE metric type in metrics
 v2.
  What is the difference between the two?

 Thanks,
 LiuLei





domain socket

2013-08-28 Thread lei liu
There are dfs.client.read.shortcircuit and
dfs.client.domain.socket.data.traffic configurations for domain sockets. What
is the difference between them?

Thanks,

LiuLei
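
For reference: dfs.client.read.shortcircuit lets the client bypass the datanode
and read local block files through file descriptors passed over the domain
socket, while dfs.client.domain.socket.data.traffic makes ordinary
client/datanode data traffic flow over the domain socket instead of TCP. A
hedged hdfs-site.xml sketch (the socket path is illustrative):

    <property>
      <name>dfs.client.read.shortcircuit</name>
      <value>true</value>
    </property>
    <property>
      <name>dfs.domain.socket.path</name>
      <value>/var/run/hadoop-hdfs/dn_socket</value>
    </property>
    <property>
      <name>dfs.client.domain.socket.data.traffic</name>
      <value>false</value>
    </property>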


hadoop2 and Hbase0.94

2013-08-28 Thread lei liu
I use hadoop2 and hbase0.94, but there is the exception below:

2013-08-28 11:36:12,922 ERROR
[MASTER_TABLE_OPERATIONS-dw74.kgb.sqa.cm4,13646,1377660964832-0]
executor.EventHandler(172): Caught throwable while processing
event C_M_DELETE_TABLE
java.lang.IllegalArgumentException: Wrong FS: file:/tmp/
hbase-shenxiu.cx/hbase/observed_table/47b334989065a8ac84873e6d07c1de62,
expected: hdfs://localhost.lo
caldomain:35974
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:590)
at
org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:172)
at
org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:402)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1427)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1467)
at
org.apache.hadoop.hbase.util.FSUtils.listStatus(FSUtils.java:1052)
at
org.apache.hadoop.hbase.backup.HFileArchiver.archiveRegion(HFileArchiver.java:123)
at
org.apache.hadoop.hbase.backup.HFileArchiver.archiveRegion(HFileArchiver.java:72)
at
org.apache.hadoop.hbase.master.MasterFileSystem.deleteRegion(MasterFileSystem.java:444)
at
org.apache.hadoop.hbase.master.handler.DeleteTableHandler.handleTableOperation(DeleteTableHandler.java:73)
at
org.apache.hadoop.hbase.master.handler.TableEventHandler.process(TableEventHandler.java:96)
at
org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
2013-08-28 11:37:05,653 INFO
 [Master:0;dw74.kgb.sqa.cm4,13646,1377660964832.archivedHFileCleaner]
util.FSUtils(1055): hdfs://localhost.localdomain:35974/use


Re: hadoop2 and Hbase0.94

2013-08-28 Thread lei liu
The exception occurs when I run the HBase unit tests.


2013/8/28 Harsh J ha...@cloudera.com

 Moving to u...@hbase.apache.org.

 Please share your hbase-site.xml and core-site.xml. Was this HBase
 cluster previously running on a standalone local filesystem mode?

 On Wed, Aug 28, 2013 at 2:06 PM, lei liu liulei...@gmail.com wrote:
  I use hadoop2 and hbase0.94, but there is below exception:
 
  2013-08-28 11:36:12,922 ERROR
  [MASTER_TABLE_OPERATIONS-dw74.kgb.sqa.cm4,13646,1377660964832-0]
  executor.EventHandler(172): Caught throwable while processing
  event C_M_DELETE_TABLE
  java.lang.IllegalArgumentException: Wrong FS:
  file:/tmp/
 hbase-shenxiu.cx/hbase/observed_table/47b334989065a8ac84873e6d07c1de62,
  expected: hdfs://localhost.lo
  caldomain:35974
  at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:590)
  at
 
 org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:172)
  at
 
 org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:402)
  at
 org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1427)
  at
 org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1467)
  at
  org.apache.hadoop.hbase.util.FSUtils.listStatus(FSUtils.java:1052)
  at
 
 org.apache.hadoop.hbase.backup.HFileArchiver.archiveRegion(HFileArchiver.java:123)
  at
 
 org.apache.hadoop.hbase.backup.HFileArchiver.archiveRegion(HFileArchiver.java:72)
  at
 
 org.apache.hadoop.hbase.master.MasterFileSystem.deleteRegion(MasterFileSystem.java:444)
  at
 
 org.apache.hadoop.hbase.master.handler.DeleteTableHandler.handleTableOperation(DeleteTableHandler.java:73)
  at
 
 org.apache.hadoop.hbase.master.handler.TableEventHandler.process(TableEventHandler.java:96)
  at
  org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
  at
 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  at
 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  at java.lang.Thread.run(Thread.java:662)
  2013-08-28 11:37:05,653 INFO
  [Master:0;dw74.kgb.sqa.cm4,13646,1377660964832.archivedHFileCleaner]
  util.FSUtils(1055): hdfs://localhost.localdomain:35974/use



 --
 Harsh J



Re: hadoop2 and Hbase0.94

2013-08-28 Thread lei liu
In org.apache.hadoop.hbase.coprocessor.TestMasterObserver unit test.


2013/8/28 lei liu liulei...@gmail.com

 When I run hbase unit test, there is the exception.


 2013/8/28 Harsh J ha...@cloudera.com

 Moving to u...@hbase.apache.org.

 Please share your hbase-site.xml and core-site.xml. Was this HBase
 cluster previously running on a standalone local filesystem mode?

 On Wed, Aug 28, 2013 at 2:06 PM, lei liu liulei...@gmail.com wrote:
  I use hadoop2 and hbase0.94, but there is below exception:
 
  2013-08-28 11:36:12,922 ERROR
  [MASTER_TABLE_OPERATIONS-dw74.kgb.sqa.cm4,13646,1377660964832-0]
  executor.EventHandler(172): Caught throwable while processing
  event C_M_DELETE_TABLE
  java.lang.IllegalArgumentException: Wrong FS:
  file:/tmp/
 hbase-shenxiu.cx/hbase/observed_table/47b334989065a8ac84873e6d07c1de62,
  expected: hdfs://localhost.lo
  caldomain:35974
  at
 org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:590)
  at
 
 org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:172)
  at
 
 org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:402)
  at
 org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1427)
  at
 org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1467)
  at
  org.apache.hadoop.hbase.util.FSUtils.listStatus(FSUtils.java:1052)
  at
 
 org.apache.hadoop.hbase.backup.HFileArchiver.archiveRegion(HFileArchiver.java:123)
  at
 
 org.apache.hadoop.hbase.backup.HFileArchiver.archiveRegion(HFileArchiver.java:72)
  at
 
 org.apache.hadoop.hbase.master.MasterFileSystem.deleteRegion(MasterFileSystem.java:444)
  at
 
 org.apache.hadoop.hbase.master.handler.DeleteTableHandler.handleTableOperation(DeleteTableHandler.java:73)
  at
 
 org.apache.hadoop.hbase.master.handler.TableEventHandler.process(TableEventHandler.java:96)
  at
  org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
  at
 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  at
 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  at java.lang.Thread.run(Thread.java:662)
  2013-08-28 11:37:05,653 INFO
  [Master:0;dw74.kgb.sqa.cm4,13646,1377660964832.archivedHFileCleaner]
  util.FSUtils(1055): hdfs://localhost.localdomain:35974/use



 --
 Harsh J





Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.

2013-08-15 Thread lei liu
Hi Jitendra,

I don't use the compression parameter.

My network card is 100M/s, and I set dfs.image.transfer.bandwidthPerSec
to 50M, so I think the Active NameNode still has 50M of bandwidth available to
handle RPC requests. Why did the OPS drop by 50%?


2013/8/15 Jitendra Yadav jeetuyadav200...@gmail.com

 Hi,
 Looks like you got some pace, did you also tried with compression
 parameter? I think you will get more optimization with it. Also file
 transfer speed depends on our network bandwidth between PNN/SNN and network
 traffic b/w nodes.What's your network conf?
 Thanks


 On Wed, Aug 14, 2013 at 11:39 AM, lei liu liulei...@gmail.com wrote:

  I set the dfs.image.transfer.bandwidthPerSec. to 50M, and the
 performance is below:

 2013-08-14 12:32:33,079 INFO my.EditLogPerformance: totalCount:1342440
 speed:
 2013-08-14 12:32:43,082 INFO my.EditLogPerformance: totalCount:1363338
 speed:1044
 2013-08-14 12:32:53,085 INFO my.EditLogPerformance: totalCount:1385526
 speed:1109
 *2013-08-14 12:33:03,087 INFO my.EditLogPerformance: totalCount:1396324
 speed:539*
 *2013-08-14 12:33:13,090 INFO my.EditLogPerformance: totalCount:1406232
 speed:495
 2013-08-14 12:33:23,093 INFO my.EditLogPerformance: totalCount:1415006
 speed:438
 2013-08-14 12:33:33,096 INFO my.EditLogPerformance: totalCount:1423952
 speed:447*
 *2013-08-14 12:33:43,099 INFO my.EditLogPerformance: totalCount:1437256
 speed:665*
 2013-08-14 12:33:53,102 INFO my.EditLogPerformance: totalCount:1458378
 speed:1056
 2013-08-14 12:34:03,106 INFO my.EditLogPerformance: totalCount:1479338
 speed:1048
 2013-08-14 12:34:13,108 INFO my.EditLogPerformance: totalCount:1500400
 speed:1053
 2013-08-14 12:34:23,111 INFO my.EditLogPerformance: totalCount:1521252
 speed:1042
 2013-08-14 12:34:33,114 INFO my.EditLogPerformance: totalCount:1542286
 speed:1051
 2013-08-14 12:34:43,117 INFO my.EditLogPerformance: totalCount:1562956
 speed:1033
 2013-08-14 12:34:53,120 INFO my.EditLogPerformance: totalCount:1583804
 speed:1042
 2013-08-14 12:35:03,123 INFO my.EditLogPerformance: totalCount:1606558
 speed:1137
 2013-08-14 12:35:13,126 INFO my.EditLogPerformance: totalCount:1627980
 speed:1071
 2013-08-14 12:35:23,129 INFO my.EditLogPerformance: totalCount:1650642
 speed:1133
 2013-08-14 12:35:33,132 INFO my.EditLogPerformance: totalCount:1672806
 speed:1108
 2013-08-14 12:35:43,134 INFO my.EditLogPerformance: totalCount:1693940
 speed:1056
 2013-08-14 12:35:53,137 INFO my.EditLogPerformance: totalCount:1715430
 speed:1074
 2013-08-14 12:36:03,140 INFO my.EditLogPerformance: totalCount:1737940
 speed:1125
 2013-08-14 12:36:13,143 INFO my.EditLogPerformance: totalCount:1760094
 speed:1107
 2013-08-14 12:36:23,146 INFO my.EditLogPerformance: totalCount:1781646
 speed:1077
 2013-08-14 12:36:33,149 INFO my.EditLogPerformance: totalCount:1802230
 speed:1029
 2013-08-14 12:36:43,152 INFO my.EditLogPerformance: totalCount:1824132
 speed:1095
 2013-08-14 12:36:53,155 INFO my.EditLogPerformance: totalCount:1846778
 speed:1132
 2013-08-14 12:37:03,158 INFO my.EditLogPerformance: totalCount:1868956
 speed:1108
 2013-08-14 12:37:13,161 INFO my.EditLogPerformance: totalCount:1888556
 speed:980
 2013-08-14 12:37:23,164 INFO my.EditLogPerformance: totalCount:1910512
 speed:1097
 2013-08-14 12:37:33,167 INFO my.EditLogPerformance: totalCount:1932240
 speed:1086
 2013-08-14 12:37:43,170 INFO my.EditLogPerformance: totalCount:1954226
 speed:1099
 2013-08-14 12:37:53,173 INFO my.EditLogPerformance: totalCount:1974706
 speed:1024
 2013-08-14 12:38:03,176 INFO my.EditLogPerformance: totalCount:1993906
 speed:960
 2013-08-14 12:38:13,179 INFO my.EditLogPerformance: totalCount:2014172
 speed:1013
 2013-08-14 12:38:23,182 INFO my.EditLogPerformance: totalCount:2036130
 speed:1097
 2013-08-14 12:38:33,184 INFO my.EditLogPerformance: totalCount:2057848
 speed:1085
 2013-08-14 12:38:43,187 INFO my.EditLogPerformance: totalCount:2078834
 speed:1049
 2013-08-14 12:38:53,190 INFO my.EditLogPerformance: totalCount:2095616
 speed:839
 *2013-08-14 12:39:03,193 INFO my.EditLogPerformance: totalCount:2104896
 speed:464
 2013-08-14 12:39:13,196 INFO my.EditLogPerformance: totalCount:2114572
 speed:483
 2013-08-14 12:39:23,199 INFO my.EditLogPerformance: totalCount:2123512
 speed:447*
 *2013-08-14 12:39:33,202 INFO my.EditLogPerformance: totalCount:2133604
 speed:504*
 2013-08-14 12:39:43,205 INFO my.EditLogPerformance: totalCount:2149792
 speed:809



 The there are below info in Active NameNode:
 2013-08-14 12:44:47,301 INFO
 org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
 to
 http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1txid=655178418storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
 2013-08-14 12:48:57,529 INFO
 org.apache.hadoop.hdfs.server.namenode.TransferFsImage: *Transfer took
 250.23s at 10280.59 KB/s*
 2013-08-14 12:48:57,530 INFO
 org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
 fsimage.ckpt_00655178418 size

dynamic configuration

2013-08-14 Thread lei liu
There is a ReconfigurationServlet class in hadoop-2.0.5.

How do I use this function for the NameNode and DataNode?

Thanks,

LiuLei


Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.

2013-08-13 Thread lei liu
The fsimage file size is 1658934155


2013/8/13 Harsh J ha...@cloudera.com

 How large are your checkpointed fsimage files?

 On Mon, Aug 12, 2013 at 3:42 PM, lei liu liulei...@gmail.com wrote:
  When Standby Namenode is doing checkpoint,  upload the image file to
 Active
  NameNode, the Active NameNode is very slow. What is reason result to the
  Active NameNode is slow?
 
 
  Thanks,
 
  LiuLei
 



 --
 Harsh J



Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.

2013-08-13 Thread lei liu
  my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087546  speed:6
2013-08-13 17:49:51,599 INFO  my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087716  speed:8
2013-08-13 17:50:01,602 INFO  my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11091608  speed:194



The speed is sometimes less than ten. I find that when the Active NameNode
downloads the fsimage file, the speed is less than ten, so I think downloading
the fsimage file affects the performance of the Active NameNode.


There are below info in Standby NameNode:
2013-08-13 17:48:12,412 INFO
org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
checkpoint because there have been 2558038 txns since the last checkpoint,
which exceeds the configured threshold 100
2013-08-13 17:48:12,413 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
/home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_00521186406
using no compression
2013-08-13 17:49:19,085 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Image file of size
3385425100 saved in 66 seconds.
2013-08-13 17:49:19,655 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
to
http://10.232.98.77:20021/getimage?putimage=1txid=521186406port=20021storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
2013-08-13 17:53:21,107 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
241.45s at 0.00 KB/s
2013-08-13 17:53:21,107 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
txid 521186406 to namenode at 10.232.98.77:20021


There are below info in Active NameNode:
2013-08-13 17:49:19,659 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
to
http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1txid=521186406storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
2013-08-13 17:53:20,610 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
240.95s at 13720.96 KB/s
2013-08-13 17:53:20,610 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
fsimage.ckpt_00521186406 size 3385425100 bytes.












2013/8/13 Jitendra Yadav jeetuyadav200...@gmail.com

 Hi,

 Can you please let me know that how you identified the slowness between
 primary and standby namnode?

 Also please share the network connection bandwidth between these two
 servers.

 Thanks

 On Tue, Aug 13, 2013 at 11:52 AM, lei liu liulei...@gmail.com wrote:

 The fsimage file size is 1658934155


 2013/8/13 Harsh J ha...@cloudera.com

 How large are your checkpointed fsimage files?

 On Mon, Aug 12, 2013 at 3:42 PM, lei liu liulei...@gmail.com wrote:
  When Standby Namenode is doing checkpoint,  upload the image file to
 Active
  NameNode, the Active NameNode is very slow. What is reason result to
 the
  Active NameNode is slow?
 
 
  Thanks,
 
  LiuLei
 



 --
 Harsh J






EditLogPerformance.java
Description: Binary data


when Standby Namenode is doing checkpoint, the Active NameNode is slow.

2013-08-12 Thread lei liu
When the Standby NameNode does a checkpoint and uploads the image file to the
Active NameNode, the Active NameNode becomes very slow. What is the reason the
Active NameNode is slow?


Thanks,

LiuLei
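
For reference, the image upload/download can be throttled so a checkpoint does
not saturate the Active NameNode's network link; a hedged hdfs-site.xml sketch
(the value is in bytes per second and illustrative; 0 means unthrottled):

    <property>
      <name>dfs.image.transfer.bandwidthPerSec</name>
      <value>10485760</value>
    </property>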


Re: MutableCounterLong metrics display in ganglia

2013-08-10 Thread lei liu
Thanks Harsh for your reply.

What is the difference between the MutableCounterLong and MutableGaugeLong classes?

I find that MutableCounterLong is used to calculate throughput, with the value
reset every ten seconds, and MutableGaugeLong is up-counting with no reset.

I am new to hadoop-2.0.5, so please tell me if there is an error.

Thanks,

LiuLei





2013/8/9 Harsh J ha...@cloudera.com

 The counter, being num-ops, should up-count and not reset. Note that
 your test may be at fault though - calling hsync may not always call
 NN#fsync(…) unless you are passing the proper flags to make it always
 do so.

 On Wed, Aug 7, 2013 at 4:27 PM, lei liu liulei...@gmail.com wrote:
  I use hadoop-2.0.5 and config hadoop-metrics2.properties file with below
  content.
  *.sink.ganglia.class=org.
  apache.hadoop.metrics2.sink.ganglia.GangliaSink31
  *.sink.ganglia.period=10
  *.sink.ganglia.supportsparse=true
  namenode.sink.ganglia.servers=10.232.98.74:8649
  datanode.sink.ganglia.servers=10.232.98.74:8649
 
  I write one programme that call FSDataOutputStream.hsync() method once
 per
  second.
 
  There is @Metric MutableCounterLong fsyncCount metrics in
 DataNodeMetrics,
  when FSDataOutputStream.hsync() method is called, the value of
  fsyncCount
  is increased, dataNode send the value of  fsyncCount to ganglia every ten
  seconds, so I think the value  of  fsyncCount in ganglia should be 10, 20
  ,30, 40 and so on .  but the ganglia display 1,1,1,1,1 .. , so the
 value
  is
  the value of fsyncCount is set to zero every ten seconds and
  ”fsyncCount.value/10“ .
 
 
  Is  the the value of MutableCounterLong class  set to zero every ten
 seconds
  and   MutableCounterLong .value/10?
 
  Thanks,
 
  LiuLei
 
 



 --
 Harsh J



MutableCounterLong and MutableGaugeLong class difference in metrics v2

2013-08-08 Thread lei liu
I use hadoop-2.0.5; there are MutableCounterLong and MutableGaugeLong
classes in metrics v2.

I am studying the metrics v2 code.

What is the difference between the MutableCounterLong and MutableGaugeLong classes?

I find that MutableCounterLong is used to calculate throughput; is that
right? How does metrics v2 handle the MutableCounterLong class?


Thanks,

LiuLei


MutableCounterLong metrics display in ganglia

2013-08-07 Thread lei liu
I use hadoop-2.0.5 and config hadoop-metrics2.properties file with below
content.
*.sink.ganglia.class=org.
apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.sink.ganglia.period=10
*.sink.ganglia.supportsparse=true
namenode.sink.ganglia.servers=10.232.98.74:8649
datanode.sink.ganglia.servers=10.232.98.74:8649

I wrote one program that calls the FSDataOutputStream.hsync() method once per
second.

There is a @Metric MutableCounterLong fsyncCount metric in
DataNodeMetrics. When the FSDataOutputStream.hsync() method is called, the
value of fsyncCount is increased, and the datanode sends the value of fsyncCount
to ganglia every ten seconds, so I think the value of fsyncCount in
ganglia should be 10, 20, 30, 40 and so on. But ganglia displays
1, 1, 1, 1, 1, ..., as if the value of fsyncCount were set to zero every ten
seconds and reported as fsyncCount.value/10.


Is the value of the MutableCounterLong class set to zero every ten
seconds and reported as MutableCounterLong.value/10?

Thanks,

LiuLei


throughput metrics in hadoop-2.0.5

2013-08-06 Thread lei liu
I use hadoop-2.0.5 and config hadoop-metrics2.properties file with below
content.
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.sink.ganglia.period=10
*.sink.ganglia.supportsparse=true
namenode.sink.ganglia.servers=10.232.98.74:8649
datanode.sink.ganglia.servers=10.232.98.74:8649

I wrote one program that calls the FSDataOutputStream.hsync()
method once per second.

There is a @Metric MutableCounterLong fsyncCount metric in
DataNodeMetrics; the MutableCounterLong class continuously increases the
value, so I think the value in ganglia should be 10, 20, 30, 40 and so on,
but the value in ganglia is below:


[image: dw62.kgb.sqa.cm4 dfs.datanode.FsyncCount]


I want to know how ganglia displays the value of the MutableCounterLong
class.

Thanks,

LiuLei


MutableRate metrics in hadoop-2.0.5

2013-08-06 Thread lei liu
There is code in MutableRate  class:

  public synchronized void snapshot(MetricsRecordBuilder builder, boolean all) {
    if (all || changed()) {
      numSamples += intervalStat.numSamples();
      builder.addCounter(numInfo, numSamples)
             .addGauge(avgInfo, lastStat().mean());
      if (extended) {
        builder.addGauge(stdevInfo, lastStat().stddev())
               .addGauge(iMinInfo, lastStat().min())
               .addGauge(iMaxInfo, lastStat().max())
               .addGauge(minInfo, minMax.min())
               .addGauge(maxInfo, minMax.max());
      }
      if (changed()) {
        if (numSamples > 0) {
          intervalStat.copyTo(prevStat);
          intervalStat.reset();
        }
        clearChanged();
      }
    }
  }

How can I set the extended variable to true?

Thanks,

LiuLei
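
For reference, the extended flag is set when the MutableRate is created; a
minimal sketch (names illustrative), assuming the metric is built through a
MetricsRegistry, whose newRate overload takes an extended argument:

    import org.apache.hadoop.metrics2.lib.MetricsRegistry;
    import org.apache.hadoop.metrics2.lib.MutableRate;

    public class RateExample {
      private final MetricsRegistry registry = new MetricsRegistry("RateExample");
      // extended = true makes snapshot() also emit the stdev/min/max gauges
      private final MutableRate syncTime =
          registry.newRate("SyncTime", "time spent in fsync", true);

      public void recordSync(long elapsedMillis) {
        syncTime.add(elapsedMillis);
      }
    }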


Re: throughput metrics in hadoop-2.0.5

2013-08-06 Thread lei liu
There is a @Metric MutableCounterLong fsyncCount metric in
DataNodeMetrics; the MutableCounterLong class continuously increases the
value, so I think the value in ganglia should be 10, 20, 30, 40 and so
on, but the displayed value is fsyncCount.value/10, that is, 1, 1, 1, 1
in ganglia.

How does ganglia display the value of the MutableCounterLong class? Is
it fsyncCount.value or fsyncCount.value/10?




2013/8/6 lei liu liulei...@gmail.com

 I use hadoop-2.0.5 and config hadoop-metrics2.properties file with below
 content.
 *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
 *.sink.ganglia.period=10
 *.sink.ganglia.supportsparse=true
 namenode.sink.ganglia.servers=10.232.98.74:8649
 datanode.sink.ganglia.servers=10.232.98.74:8649

 I write one programme that call FSDataOutputStream.hsync() method once
 per second.

 There is @Metric MutableCounterLong fsyncCount metrics in
 DataNodeMetrics, the MutableCounterLong class continuously increase the
 value, so I think the value in ganglia should be 10, 20 ,30, 40 and so on.
 but  the value in ganglia is below:


 [image: dw62.kgb.sqa.cm4 dfs.datanode.FsyncCount]


 I want to know the ganglia how to display the value of MutableCounterLong
 class?

 Thanks,

 LiuLei



Re: throughput metrics in hadoop-2.0.5

2013-08-06 Thread lei liu
Is the value of the MutableCounterLong class set to zero every 10 seconds?


2013/8/6 lei liu liulei...@gmail.com

 There is @Metric MutableCounterLong fsyncCount metrics in
 DataNodeMetrics, the MutableCounterLong class continuously increase the
 value, so I think the value in ganglia should be 10, 20 ,30, 40 and so
 on.  but  the value the value is fsyncCount.value/10, that is in  1 ,1 , 1
 , 1  in ganglia.

  How does ganglia  to display the value of MutableCounterLong class? Is
 that fsyncCount.value or fsyncCount.value/10?




 2013/8/6 lei liu liulei...@gmail.com

 I use hadoop-2.0.5 and config hadoop-metrics2.properties file with below
 content.
 *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
 *.sink.ganglia.period=10
 *.sink.ganglia.supportsparse=true
 namenode.sink.ganglia.servers=10.232.98.74:8649
 datanode.sink.ganglia.servers=10.232.98.74:8649

 I write one programme that call FSDataOutputStream.hsync() method once
 per second.

 There is @Metric MutableCounterLong fsyncCount metrics in
 DataNodeMetrics, the MutableCounterLong class continuously increase the
 value, so I think the value in ganglia should be 10, 20 ,30, 40 and so on.
 but  the value in ganglia is below:


 [image: dw62.kgb.sqa.cm4 dfs.datanode.FsyncCount]


 I want to know the ganglia how to display the value of MutableCounterLong
 class?

 Thanks,

 LiuLei





Re: throughput metrics in hadoop-2.0.5

2013-08-06 Thread lei liu
Is the value of the MutableCounterLong class set to zero every 10 seconds?


2013/8/6 lei liu liulei...@gmail.com

 Is  the the value of MutableCounterLong class  set to zreo per 10 seconds?


 2013/8/6 lei liu liulei...@gmail.com

 There is @Metric MutableCounterLong fsyncCount metrics in
 DataNodeMetrics, the MutableCounterLong class continuously increase the
 value, so I think the value in ganglia should be 10, 20 ,30, 40 and so
 on.  but  the value the value is fsyncCount.value/10, that is in  1 ,1 , 1
 , 1  in ganglia.

  How does ganglia  to display the value of MutableCounterLong class? Is
 that fsyncCount.value or fsyncCount.value/10?




 2013/8/6 lei liu liulei...@gmail.com

 I use hadoop-2.0.5 and config hadoop-metrics2.properties file with below
 content.

 *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
 *.sink.ganglia.period=10
 *.sink.ganglia.supportsparse=true
 namenode.sink.ganglia.servers=10.232.98.74:8649
 datanode.sink.ganglia.servers=10.232.98.74:8649

 I write one programme that call FSDataOutputStream.hsync() method once
 per second.

 There is @Metric MutableCounterLong fsyncCount metrics in
 DataNodeMetrics, the MutableCounterLong class continuously increase the
 value, so I think the value in ganglia should be 10, 20 ,30, 40 and so on.
 but  the value in ganglia is below:


 [image: dw62.kgb.sqa.cm4 dfs.datanode.FsyncCount]


 I want to know the ganglia how to display the value of
 MutableCounterLong class?

 Thanks,

 LiuLei






Re: metrics v1 in hadoop-2.0.5

2013-08-05 Thread lei liu
There is a hadoop-metrics.properties file in the etc/hadoop directory.
I configured the file with the content below:
 dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
 dfs.period=10
 dfs.servers=dw74:8649

But the configuration does not work.

Can I only use metrics v2 in hadoop-2.0.5?


2013/8/5 lei liu liulei...@gmail.com

 Can I use metrics v1 in hadoop-2.0.5?

 Thanks,

 LiuLei



metrics v1 in hadoop-2.0.5

2013-08-04 Thread lei liu
Can I use metrics v1 in hadoop-2.0.5?

Thanks,

LiuLei


Standby NameNode checkpoint exception

2013-08-01 Thread lei liu
I use hadoop-2.0.5 and QJM for HA.

When the Standby NameNode does a checkpoint, there is the exception below in
the Standby NameNode:
2013-08-01 13:43:07,965 INFO
org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
checkpoint because there have been 763426 txns since the last checkpoint, wh
ich exceeds the configured threshold 4
2013-08-01 13:43:07,966 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
/home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_00048708235
usi
ng no compression
2013-08-01 13:43:37,405 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Image file of size
1504089705 saved in 29 seconds.
2013-08-01 13:43:37,410 INFO
org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager: Going to
retain 2 images with txid = 47944809
2013-08-01 13:43:37,410 INFO
org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager: Purging
old image FSImageFile(file=/home/musa.ll/hadoop2/cluster-data/name/current/f
simage_00047222679, cpktTxId=00047222679)
2013-08-01 13:43:37,723 WARN
org.apache.hadoop.hdfs.server.namenode.FSEditLog: Unable to determine input
streams from QJM to [10.232.98.61:20022, 10.232.98.62:20022, 10.232.98.63:
20022, 10.232.98.64:20022, 10.232.98.65:20022]. Skipping.
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many
exceptions to achieve quorum size 3/5. 4 exceptions thrown:
10.232.98.62:20022: Asked for firstTxId 46944810 which is in the middle of
file
/home/musa.ll/hadoop2/journal/mycluster/current/edits_00046630461-00047222679
at
org.apache.hadoop.hdfs.server.namenode.FileJournalManager.getRemoteEditLogs(FileJournalManager.java:183)
at
org.apache.hadoop.hdfs.qjournal.server.Journal.getEditLogManifest(Journal.java:628)
at
org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.getEditLogManifest(JournalNodeRpcServer.java:180)
at
org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.getEditLogManifest(QJournalProtocolServerSideTranslatorPB.java:203)
at
org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:14028)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1741)
2013-08-01 14:28:07,051 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
26.08s at 0.00 KB/s
2013-08-01 14:28:07,051 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
txid 60835762 to namenode at 10.232.98.77:20021
2013-08-01 14:29:05,203 INFO
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Triggering log
roll on remote NameNode /10.232.98.77:20020
2013-08-01 14:29:06,242 INFO
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log:
137678/567332 transactions completed. (24%)
2013-08-01 14:29:07,243 INFO
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log:
275618/567332 transactions completed. (49%)
2013-08-01 14:29:08,244 INFO
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log:
407627/567332 transactions completed. (72%)
2013-08-01 14:29:09,245 INFO
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log:
545153/567332 transactions completed. (96%)
2013-08-01 14:29:20,146 INFO
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Loaded 567332
edits starting from txid 60835762
2013-08-01 14:30:44,411 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Image file of size
1950604672 saved in 37 seconds.
2013-08-01 14:30:44,416 INFO
org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager: Going to
retain 2 images with txid = 60835762
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many
exceptions to achieve quorum size 3/5. 4 exceptions thrown:
10.232.98.62:20022: Asked for firstTxId 59835763 which is in the middle of
file
/home/musa.ll/hadoop2/journal/mycluster/current/edits_00059678382-00060264590
at
org.apache.hadoop.hdfs.server.namenode.FileJournalManager.getRemoteEditLogs(FileJournalManager.java:183)
at
org.apache.hadoop.hdfs.qjournal.server.Journal.getEditLogManifest(Journal.java:628)
at
org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.getEditLogManifest(JournalNodeRpcServer.java:180)
at
org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.getEditLogManifest(QJournalProtocolServerSideTranslatorPB.java:203)
at
org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:14028)
at

Re: ./hdfs namenode -bootstrapStandby error

2013-07-20 Thread lei liu
Hi, Azuryy

To run the 'hdfs namenode -initializeSharedEdits' command on the active
NN, I must stop the Active NameNode.

I think that when executing the ./hdfs namenode -bootstrapStandby command on
the Standby NameNode, the Active NameNode and JournalNodes should be alive;
otherwise there is no HA.

I am a beginner with QJM, so if there is something wrong, please correct me.


Thanks,

LiuLei









2013/7/19 Azuryy Yu azury...@gmail.com

 hi,

 can you using
 'hdfs namenode -initializeSharedEdits' on the active NN, remember start
 all journal nodes before try this.
  On Jul 19, 2013 5:17 PM, lei liu liulei...@gmail.com wrote:

 I  use hadoop-2.0.5 version and use QJM for HA.


 I use ./hdfs namenode -bootstrapStandby for StandbyNameNode, but report
 below error:

 =
 About to bootstrap Standby ID nn2 from:
Nameservice ID: mycluster
 Other Namenode ID: nn1
   Other NN's HTTP address: 10.232.98.77:20021
   Other NN's IPC  address: dw77.kgb.sqa.cm4/10.232.98.77:20020
  Namespace ID: 1499625118
 Block pool ID: BP-2012507965-10.232.98.77-1372993302021
Cluster ID: CID-921af0aa-b831-4828-965c-3b71a5149600
Layout version: -40
 =
 Re-format filesystem in Storage Directory
 /home/musa.ll/hadoop2/cluster-data/name ? (Y or N) Y
 13/07/19 17:04:28 INFO common.Storage: Storage directory
 /home/musa.ll/hadoop2/cluster-data/name has been successfully formatted.
 13/07/19 17:04:29 FATAL ha.BootstrapStandby: Unable to read transaction
 ids 16317-16337 from the configured shared edits storage qjournal://
 10.232.98.61:20022;10.232.98.62:20022;10.232.98.63:20022/mycluster.
 Please copy these logs into the shared edits storage or call saveNamespace
 on the active node.
 Error: Gap in transactions. Expected to be able to read up until at least
 txid 16337 but unable to find any edit logs containing txid 16331
 13/07/19 17:04:29 INFO util.ExitUtil: Exiting with status 6



 The edit logs are below content in JournalNode:
 -rw-r--r-- 1 musa.ll users  30 Jul 19 15:51
 edits_0016327-0016328
 -rw-r--r-- 1 musa.ll users  30 Jul 19 15:53
 edits_0016329-0016330
 -rw-r--r-- 1 musa.ll users 1048576 Jul 19 17:03
 edits_inprogress_0016331


 The edits_inprogress_0016331 should contains the 16331-16337
 transactions, why the ./hdfs namenode -bootstrapStandby command report
 error? How can I initialize the StandbyNameNode?

 Thanks,

 LiuLei














./hdfs namenode -bootstrapStandby error

2013-07-19 Thread lei liu
I use the hadoop-2.0.5 version with QJM for HA.


I run ./hdfs namenode -bootstrapStandby for the standby NameNode, but it
reports the error below:

=
About to bootstrap Standby ID nn2 from:
   Nameservice ID: mycluster
Other Namenode ID: nn1
  Other NN's HTTP address: 10.232.98.77:20021
  Other NN's IPC  address: dw77.kgb.sqa.cm4/10.232.98.77:20020
 Namespace ID: 1499625118
Block pool ID: BP-2012507965-10.232.98.77-1372993302021
   Cluster ID: CID-921af0aa-b831-4828-965c-3b71a5149600
   Layout version: -40
=
Re-format filesystem in Storage Directory
/home/musa.ll/hadoop2/cluster-data/name ? (Y or N) Y
13/07/19 17:04:28 INFO common.Storage: Storage directory
/home/musa.ll/hadoop2/cluster-data/name has been successfully formatted.
13/07/19 17:04:29 FATAL ha.BootstrapStandby: Unable to read transaction ids
16317-16337 from the configured shared edits storage
qjournal://10.232.98.61:20022;10.232.98.62:20022;
10.232.98.63:20022/mycluster. Please copy these logs into the shared edits
storage or call saveNamespace on the active node.
Error: Gap in transactions. Expected to be able to read up until at least
txid 16337 but unable to find any edit logs containing txid 16331
13/07/19 17:04:29 INFO util.ExitUtil: Exiting with status 6



The edit logs are below content in JournalNode:
-rw-r--r-- 1 musa.ll users  30 Jul 19 15:51
edits_0016327-0016328
-rw-r--r-- 1 musa.ll users  30 Jul 19 15:53
edits_0016329-0016330
-rw-r--r-- 1 musa.ll users 1048576 Jul 19 17:03
edits_inprogress_0016331


The edits_inprogress_0016331 file should contain transactions 16331-16337,
so why does the ./hdfs namenode -bootstrapStandby command report an
error? How can I initialize the standby NameNode?

Thanks,

LiuLei


QJM and dfs.namenode.edits.dir

2013-07-17 Thread lei liu
When I use QJM for HA, do I need to save the edit log on the local filesystem
as well?

I think QJM already provides high availability for the edit log, so I should
not need to configure dfs.namenode.edits.dir.


Thanks,

LiuLei


QJM for federation

2013-07-17 Thread lei liu
I have two namespaces, for example:

<property>
  <name>dfs.nameservices</name>
  <value>ns1,ns2</value>
</property>

Can I configure dfs.namenode.shared.edits.dir with the content below?
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://10.232.98.61:20022;10.232.98.62:20022;10.232.98.63:20022/nn1,nn2</value>
</property>


Thanks,

LiuLei


Re: QJM for federation

2013-07-17 Thread lei liu
Thanks Harsh.




2013/7/17 Harsh J ha...@cloudera.com

 This has been asked previously. Use suffixes to solve your issue. See
 http://search-hadoop.com/m/Fingkg6Dk91

 On Wed, Jul 17, 2013 at 1:33 PM, lei liu liulei...@gmail.com wrote:
  I have two namespaces, example below:
 
  property
  namedfs.nameservices/name
  valuens1,ns2/value
   /property
 
  Can I config the dfs.namenode.shared.edits.dir to below content?
  property
namedfs.namenode.shared.edits.dir/name
 
  valueqjournal://10.232.98.61:20022;10.232.98.62:20022;
 10.232.98.63:20022/nn1,nn2/value
 
  /property
 
 
  Thanks,
 
  LiuLei
 
 
 
 
 



 --
 Harsh J



Re: QJM for federation

2013-07-17 Thread lei liu
I have another question about QJM.

If I use QJM for HA, do I need to save the edit log on the local filesystem
as well?

I think QJM already provides high availability for the edit log, so I should
not need to configure dfs.namenode.edits.dir.


Thanks,

LiuLei


2013/7/17 lei liu liulei...@gmail.com

 Thanks Harsh.




 2013/7/17 Harsh J ha...@cloudera.com

 This has been asked previously. Use suffixes to solve your issue. See
 http://search-hadoop.com/m/Fingkg6Dk91

 On Wed, Jul 17, 2013 at 1:33 PM, lei liu liulei...@gmail.com wrote:
  I have two namespaces, example below:
 
  property
  namedfs.nameservices/name
  valuens1,ns2/value
   /property
 
  Can I config the dfs.namenode.shared.edits.dir to below content?
  property
namedfs.namenode.shared.edits.dir/name
 
  valueqjournal://10.232.98.61:20022;10.232.98.62:20022;
 10.232.98.63:20022/nn1,nn2/value
 
  /property
 
 
  Thanks,
 
  LiuLei
 
 
 
 
 



 --
 Harsh J





block over-replicated

2013-04-11 Thread lei liu
I use hadoop-2.0.3. I find that when one block is over-replicated, the excess
replicas are added to the excessReplicateMap attribute of BlockManager. But
when the block is deleted, or the block reaches the intended number of
replicas, the replicas are not removed from the excessReplicateMap attribute.

I think this is a bug. If my understanding is wrong, please correct me: when
are replicas removed from the excessReplicateMap attribute?


Re: DFSOutputStream.sync() method latency time

2013-03-29 Thread lei liu
The sync method includes the code below:

    // Flush only if we haven't already flushed till this offset.
    if (lastFlushOffset != bytesCurBlock) {
      assert bytesCurBlock > lastFlushOffset;
      // record the valid offset of this flush
      lastFlushOffset = bytesCurBlock;
      enqueueCurrentPacket();
    }


When there are 64 KB of data in memory, the write method calls the
enqueueCurrentPacket method to send one packet to the pipeline. But when the
data in memory is less than 64 KB, the write method does not call
enqueueCurrentPacket, so it does not send any data to the pipeline; then, when
the client calls the sync method, sync calls enqueueCurrentPacket to send the
data to the pipeline and waits for the ack.





2013/3/29 Yanbo Liang yanboha...@gmail.com

 The write method write data to memory of client, the sync method send
 package to pipeline I thin you made a mistake for understanding the write
 procedure of HDFS.

 It's right that the write method write data to memory of client, however
 the data in the client memory is sent to DataNodes at the time when it was
 filled to the client memory. This procedure is finished by another thread,
 so it's concurrent operation.

 sync method has the same operation except for it is used for the last
 packet in the stream. It waits until have received ack from DataNodes.

 The write method and sync method is not concurrent. The write method or
 sync method is concurrent with the backend thread which is used to transfer
 data to DataNodes.

 And I guess you can understand Chinese, so I recommend you to read one of
 my blog(http://yanbohappy.sinaapp.com/?p=143) and it explain the write
 workflow detail.


 2013/3/29 lei liu liulei...@gmail.com

 Thanks Yanbo for your reply.

 I  test code are :
 FSDataOutputStream outputStream = fs.create(path);
 Random r = new Random();
 long totalBytes = 0;
 String str =  new String(new byte[1024]);
 while(totalBytes  1024 * 1024 * 500) {
   byte[] bytes = (start_+r.nextLong() +_ + str +
 r.nextLong()+_end + \n).getBytes();
   outputStream.write(bytes);
   outputStream.sync();
   totalBytes = totalBytes + bytes.length;
 }
 outputStream.close();


 The write method and sync method is synchronized, so the two method is
 not cocurrent.

 The write method write data to memory of client, the sync method send
 package to pipelien,  client can execute write  method  until the  sync
 method return sucess,  so I  think the sync method latency time should be
 equal with superposition of each datanode operation.




 2013/3/28 Yanbo Liang yanboha...@gmail.com

 1st when client wants to write data to HDFS, it should be create
 DFSOutputStream.
 Then the client write data to this output stream and this stream will
 transfer data to all DataNodes with the constructed pipeline by the means
 of Packet whose size is 64KB.
 These two operations is concurrent, so the write latency is not simple
 superposition.

 2nd the sync method only flush the last packet ( at most 64KB ) data to
 the pipeline.

 Because of the cocurrent processing of all these operations, so the
 latency is smaller than the superposition of each operation.
 It's parallel computing rather than serial computing in a sense.


 2013/3/28 lei liu liulei...@gmail.com

 When client  write data, if there are three replicates,  the sync
 method latency time formula should be:
 sync method  latency time = first datanode receive data time + sencond
 datanode receive data  time +  third datanode receive data time.

 if the three datanode receive data time all are 2 millisecond, so the
 sync method  latency time should is 6 millisecond,  but according to our
 our monitor, the the sync method  latency time is 2 millisecond.


 How to calculate sync method  latency time?


 Thanks,

 LiuLei







DFSOutputStream.sync() method latency time

2013-03-28 Thread lei liu
When the client writes data and there are three replicas, I expected the sync
method latency formula to be:
sync method latency = first datanode receive time + second datanode receive
time + third datanode receive time.

If each of the three datanodes takes 2 milliseconds to receive the data, the
sync method latency should be 6 milliseconds, but according to our monitoring
the sync method latency is 2 milliseconds.


How is the sync method latency calculated?


Thanks,

LiuLei


Re: DFSOutputStream.sync() method latency time

2013-03-28 Thread lei liu
Thanks Yanbo for your reply.

My test code is:

FSDataOutputStream outputStream = fs.create(path);
Random r = new Random();
long totalBytes = 0;
String str = new String(new byte[1024]);
while (totalBytes < 1024 * 1024 * 500) {
  byte[] bytes = ("start_" + r.nextLong() + "_" + str +
      r.nextLong() + "_end" + "\n").getBytes();
  outputStream.write(bytes);
  outputStream.sync();
  totalBytes = totalBytes + bytes.length;
}
outputStream.close();


The write method and the sync method are called one after another, so the two
methods are not concurrent.

The write method writes data into client memory and the sync method sends the
packet to the pipeline; the client cannot execute the next write until the sync
method returns successfully, so I think the sync method latency should equal
the sum of the time spent at each datanode.




2013/3/28 Yanbo Liang yanboha...@gmail.com

 1st when client wants to write data to HDFS, it should be create
 DFSOutputStream.
 Then the client write data to this output stream and this stream will
 transfer data to all DataNodes with the constructed pipeline by the means
 of Packet whose size is 64KB.
 These two operations is concurrent, so the write latency is not simple
 superposition.

 2nd the sync method only flush the last packet ( at most 64KB ) data to
 the pipeline.

 Because of the cocurrent processing of all these operations, so the
 latency is smaller than the superposition of each operation.
 It's parallel computing rather than serial computing in a sense.


 2013/3/28 lei liu liulei...@gmail.com

 When client  write data, if there are three replicates,  the sync method
 latency time formula should be:
 sync method  latency time = first datanode receive data time + sencond
 datanode receive data  time +  third datanode receive data time.

 if the three datanode receive data time all are 2 millisecond, so the
 sync method  latency time should is 6 millisecond,  but according to our
 our monitor, the the sync method  latency time is 2 millisecond.


 How to calculate sync method  latency time?


 Thanks,

 LiuLei
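
As Yanbo explains in the quoted reply, the packets for all three replicas move
through one pipeline concurrently, so the observed sync() latency is closer to
a single packet round trip plus the ack than to the sum of the three datanode
times. A minimal timing sketch built on the test loop above, kept as an
illustration only (the path and sizes are made up, and sync() corresponds to
hflush() on newer clients):

import java.util.Random;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SyncLatencyTest {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    FSDataOutputStream out = fs.create(new Path("/tmp/sync_latency_test"));
    Random r = new Random();
    byte[] payload = new byte[1024];
    long maxSyncNanos = 0;
    for (int i = 0; i < 10000; i++) {
      r.nextBytes(payload);
      out.write(payload);
      long start = System.nanoTime();
      out.sync(); // measures only the flush-and-ack part of each iteration
      maxSyncNanos = Math.max(maxSyncNanos, System.nanoTime() - start);
    }
    out.close();
    fs.close();
    System.out.println("worst sync latency (ms): " + maxSyncNanos / 1000000.0);
  }
}

Comparing these per-call numbers with the per-datanode times from the datanode
logs should show whether the pipeline behaves concurrently or serially.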





same edits file is loaded more than once

2012-11-04 Thread lei liu
I am using hadoop-0.20.2 and I want to use the HDFS HA function, so I am
researching AvatarNode. I find that if the StandbyNN fails a checkpoint, then
the next time the StandbyNN does a checkpoint the same edits file is loaded
again. Can the same edits file be loaded more than once in hadoop-0.20.2?
If not, what is the harm?

Thanks,

LiuLei


Re: ClientProtocol create、mkdirs 、rename and delete methods are not Idempotent

2012-11-04 Thread lei liu
I want to know in which applications these operations are idempotent or not
idempotent, and why. Could you give me an example?

Thank you


2012/10/29 Ted Dunning tdunn...@maprtech.com

 Create cannot be idempotent because of the problem of watches and
 sequential files.

 Similarly, mkdirs, rename and delete cannot generally be idempotent.  In
 particular applications, you might find it is OK to treat them as such, but
 there are definitely applications where they are not idempotent.


 On Sun, Oct 28, 2012 at 2:40 AM, lei liu liulei...@gmail.com wrote:

 I think these methods should are idempotent, these methods should be repeated
 calls to be harmless by same client.


 Thanks,

 LiuLei





Re: ClientProtocol create、mkdirs 、rename and delete methods are not Idempotent

2012-11-04 Thread lei liu
Hi Steve,

Thank you for your detailed and patient answer. I understand it now.


2012/11/5 Steve Loughran ste...@hortonworks.com



 On 4 November 2012 17:25, lei liu liulei...@gmail.com wrote:

 I want to know what applications are idempotent or not idempotent? and
 Why? Could you give me a example.




 When you say idempotent, I presume you mean the operation happens
 at-most-once; ignoring the degenerate case where all requests are
 rejected.

 you can take operations that fail if their conditions aren't met (delete
 path named=something) being the simplest. the operation can send an error
 back file not found', but the client library can then downgrade that to an
 idempotent assertion: when the acknowledgment was send from the namenode,
 there was nothing at the end of this path. Which will hold on a replay,
 though if someone creates a file in between, that replay could be
 observable.


 Now what about move(src,dest)?

 if it succeeds, then there is no src path, as it is now at dest.

 What happens if you call it a second time? There is no src, only dest. You
 can't report that back as a success as it is clearly a failure: no src, no
 dest. It's hard to convert that into an assertion on the observable state
 of the system as the state doesn't reflect the history, so you need some
 temporal logic in there too:: at time t0 there existed a directory src, at
 time t1 the directory src no longer existed and its contents were now found
 under directory dest.

 And again, what happens if worse someone else did something in between,
 created a src directory (which it could do, given that the first one has
 been renamed dest), the operation replays and the move takes place twice
 -you've just crossed into at-least-once operations, which is not what you
 wanted.


 At this point I'm sure you are thinking of having some kind of transaction
 journal, recording that at time Tn, transaction Xn moved the dir. Which
 means you have to start to collect a transaction log of what happened. Now
 effectively HDFS is a journalled file system, it does record a lot of
 things. It just doesn't record user transactions with it, or rescan the log
 whenever any operation comes in, so as to decided what to ignore.

 Or you just skip the filesystem changes and have some data structure
 recording recent transaction IDs; ignore repeated requests with the same
 IDs. Better, though you'd need to make that failure resistant -it's state
 must propagate to the journal and any failover namenodes so that a
 transaction replay will be idempotent even if the filesystem fails over
 between the original and replayed transaction. And of course all of this
 needs to be atomic with the filesystem state changes...

 Summary: It gets complicated fast. Throwing errors back to the caller
 makes life a lot simpler and lets the caller choose its own outcome -even
 though that's not always satisfactory.

 Alternatively: it's not that people don't want globally distributed
 transactions -it's just hard.







 2012/10/29 Ted Dunning tdunn...@maprtech.com

 Create cannot be idempotent because of the problem of watches and
 sequential files.

 Similarly, mkdirs, rename and delete cannot generally be idempotent.  In
 particular applications, you might find it is OK to treat them as such, but
 there are definitely applications where they are not idempotent.


 On Sun, Oct 28, 2012 at 2:40 AM, lei liu liulei...@gmail.com wrote:

 I think these methods should are idempotent, these methods should be 
 repeated
 calls to be harmless by same client.


 Thanks,

 LiuLei
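
A small sketch of the client-side downgrade Steve describes for delete: rather
than requiring the RPC itself to be idempotent, the caller converts the outcome
into the assertion "nothing exists at this path", which still holds on a
replay. The class and method names here are illustrative.

import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class IdempotentDelete {
  // Returns true when nothing exists at the path after the call, whether this
  // call removed it or an earlier (possibly replayed) call already had.
  public static boolean deleteIdempotently(FileSystem fs, Path path)
      throws IOException {
    boolean deleted = fs.delete(path, true); // recursive; false if nothing was there
    return deleted || !fs.exists(path);
  }
}

As the thread points out, this only works because this particular caller treats
"already gone" and "just deleted" as the same observable outcome; rename does
not admit such a simple downgrade.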







ClientProtocol create、mkdirs 、rename and delete methods are not Idempotent

2012-10-28 Thread lei liu
I think these methods should be idempotent; repeated calls by the same client
should be harmless.


Thanks,

LiuLei


Re: ClientProtocol create、mkdirs 、rename and delete methods are not Idempotent

2012-10-28 Thread lei liu
Thanks Ted for your reply.

What is the problem with watches and sequential files? If you can describe it
in detail, I can understand the problem better.

2012/10/29 Ted Dunning tdunn...@maprtech.com

 Create cannot be idempotent because of the problem of watches and
 sequential files.

 Similarly, mkdirs, rename and delete cannot generally be idempotent.  In
 particular applications, you might find it is OK to treat them as such, but
 there are definitely applications where they are not idempotent.


 On Sun, Oct 28, 2012 at 2:40 AM, lei liu liulei...@gmail.com wrote:

 I think these methods should are idempotent, these methods should be repeated
 calls to be harmless by same client.


 Thanks,

 LiuLei





Re: HDFS HA IO Fencing

2012-10-27 Thread lei liu
I use NFS v4 to test the Java FileLock.

The 192.168.1.233 machine is the NFS server; the NFS configuration in the
/etc/exports file is:
/home/hdfs.ha/share  192.168.1.221(rw,sync,no_root_squash)
/home/hdfs.ha/share  192.168.1.222(rw,sync,no_root_squash)

I run the commands below to start the NFS server:
service nfs start
service nfslock start

The 192.168.1.221 and 192.168.1.222 machines are NFS clients; the NFS
configuration in the /etc/fstab file is:
192.168.1.223:/home/hdfs.ha/share /home/hdfs.ha/share  nfs
rsize=8192,wsize=8192,timeo=14,intr

I run the commands below to start the NFS client:
service nfs start
service nfslock start

I wrote one program to acquire the file lock:
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

public class FileLockTest {
  FileLock lock;

  public void lock(String path, boolean isShare) throws IOException {
    this.lock = tryLock(path, isShare);
    if (lock == null) {
      String msg = "Cannot lock storage " + path
          + ". The directory is already locked.";
      System.out.println(msg);
      throw new IOException(msg);
    }
  }

  private FileLock tryLock(String path, boolean isShare) throws IOException {
    boolean deletionHookAdded = false;
    File lockF = new File(path);
    if (!lockF.exists()) {
      lockF.deleteOnExit();
      deletionHookAdded = true;
    }
    RandomAccessFile file = new RandomAccessFile(lockF, "rws");
    FileLock res = null;
    try {
      res = file.getChannel().tryLock(0, Long.MAX_VALUE, isShare);
    } catch (OverlappingFileLockException oe) {
      file.close();
      return null;
    } catch (IOException e) {
      e.printStackTrace();
      file.close();
      throw e;
    }
    if (res != null && !deletionHookAdded) {
      // If the file existed prior to our startup, we didn't
      // call deleteOnExit above. But since we successfully locked
      // the dir, we can take care of cleaning it up.
      lockF.deleteOnExit();
    }
    return res;
  }

  public static void main(String[] s) {
    FileLockTest fileLockTest = new FileLockTest();
    try {
      fileLockTest.lock(s[0], Boolean.valueOf(s[1]));

      Thread.sleep(1000 * 60 * 60 * 1);
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}

I ran two test cases.

1. The network is OK
I run the "java -cp ./filelock.jar lock.FileLockTest
/home/hdfs.ha/share/test.lock false" command on 192.168.1.221 to hold the file
lock, and then I run the same command on 192.168.1.222 to acquire the same
file lock; it throws the exception below:
Cannot lock storage /home/hdfs.ha/share/test.lock. The directory is already
locked.
java.io.IOException: Cannot lock storage /home/hdfs.ha/share/test.lock. The
directory is already locked.
at lock.FileLockTest.lock(FileLockTest.java:18)
at lock.FileLockTest.main(FileLockTest.java:53)

2. The machine which holds the file lock is disconnected
I run the "java -cp ./filelock.jar lock.FileLockTest
/home/hdfs.ha/share/test.lock false" command on 192.168.1.221, and then the
192.168.1.221 machine is disconnected from the network. After three minutes,
I run the "java -cp ./filelock.jar lock.FileLockTest
/home/hdfs.ha/share/test.lock false" command on 192.168.1.222, and it can
acquire the file lock.
I use the "mount | grep nfs" command to examine the mounted NFS directory on
192.168.1.221; the shared directory /home/hdfs.ha/share/ has disappeared on the
192.168.1.221 machine. So I think that when the machine holding the lock is
disconnected for a long time, another machine can acquire the same file lock.


Re: HDFS HA IO Fencing

2012-10-26 Thread lei liu
We are using NFS for shared storage. Can we use the Linux nfslock service to
implement IO fencing?

2012/10/26 Steve Loughran ste...@hortonworks.com



 On 25 October 2012 14:08, Todd Lipcon t...@cloudera.com wrote:

 Hi Liu,

 Locks are not sufficient, because there is no way to enforce a lock in a
 distributed system without unbounded blocking. What you might be referring
 to is a lease, but leases are still problematic unless you can put bounds
 on the speed with which clocks progress on different machines, _and_ have
 strict guarantees on the way each node's scheduler works. With Linux and
 Java, the latter is tough.


 on any OS running in any virtual environment, including EC2, time is
 entirely unpredictable, just to make things worse.


 On a single machine you can use file locking as the OS will know that the
 process is dead and closes the file; other programs can attempt to open the
 same file with exclusive locking -and, by getting the right failures, know
 that something else has the file, hence the other process is live. Shared
 NFS storage you need to mount with softlock set precisely to stop file
 locks lasting until some lease has expired, because the on-host liveness
 probes detect failure faster and want to react to it.


 -Steve



[no subject]

2012-10-25 Thread lei liu
http://blog.csdn.net/onlyqi/article/details/6544989
https://issues.apache.org/jira/browse/HDFS-2185
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailability.html
http://blog.csdn.net/chenpingbupt/article/details/7922042
https://issues.apache.org/jira/browse/HADOOP-8163


use DistributedCache to add many files to class path

2011-02-16 Thread lei liu
I use DistributedCache to add two files to the class path, for example with
the code below:

String jeJarPath = "/group/aladdin/lib/je-4.1.7.jar";
DistributedCache.addFileToClassPath(new Path(jeJarPath), conf);

String tairJarPath = "/group/aladdin/lib/tair-aladdin-2.3.1.jar";
DistributedCache.addFileToClassPath(new Path(tairJarPath), conf);

When the map/reduce job is executing, the
/group/aladdin/lib/tair-aladdin-2.3.1.jar file is added to the class path,
but the /group/aladdin/lib/je-4.1.7.jar file is not added to the class path.

How can I add many files to the class path?



Thanks,


LiuLei
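
A sketch of the same pattern generalized to many jars, assuming the jars
already sit in one HDFS directory (the directory and class names below are
illustrative) and the pre-YARN org.apache.hadoop.filecache.DistributedCache
API:

import java.io.IOException;

import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

public class AddLibJars {
  // Registers every jar under libDir with the distributed cache so that each
  // one should end up on the task classpath.
  public static void addLibDirToClassPath(JobConf conf, String libDir)
      throws IOException {
    FileSystem fs = FileSystem.get(conf);
    for (FileStatus status : fs.listStatus(new Path(libDir))) {
      if (status.getPath().getName().endsWith(".jar")) {
        DistributedCache.addFileToClassPath(status.getPath(), conf);
      }
    }
  }
}

Calling it once with "/group/aladdin/lib" would cover both jars from the
original code; checking the classpath-related properties in the submitted
job.xml is a quick way to see which calls actually took effect.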


create local file in tasktracker node

2011-01-22 Thread lei liu
I want to use hadoop to create a Berkeley DB index, so I need to create one
directory to store the Berkeley DB index. The code below is in my reduce task:

String tmp = job.get("hadoop.tmp.dir");
String shardName = "shard" + this.shardNum + "_" +
    UUID.randomUUID().toString();
this.localIndexFile = new File(tmp, shardName);
if (!localIndexFile.exists()) {
  boolean isSuccessfull = localIndexFile.mkdir();

  LOG.info("create directory " + this.localIndexFile + ": " +
      isSuccessfull);
}


But the localIndexFile.mkdir() method returns false. Could anyone tell me why
the method returns false? Is it because my reduce task instance does not have
permission?



Thanks,


LiuLei
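
A small diagnostic variant of the reduce-side code above, kept as a sketch (the
class and method names are illustrative): it uses mkdirs(), which also creates
any missing parent directories (mkdir() fails when the parent does not exist),
and prints the state of the parent so that a permission problem becomes
visible.

import java.io.File;
import java.util.UUID;

import org.apache.hadoop.mapred.JobConf;

public class LocalIndexDir {
  // Returns the created directory, or null if creation failed.
  public static File createShardDir(JobConf job, int shardNum) {
    String tmp = job.get("hadoop.tmp.dir");
    File parent = new File(tmp);
    File shardDir = new File(parent, "shard" + shardNum + "_" + UUID.randomUUID());
    boolean created = shardDir.mkdirs();
    if (!created) {
      System.err.println("could not create " + shardDir.getAbsolutePath()
          + "; parent exists=" + parent.exists()
          + ", parent writable=" + parent.canWrite());
    }
    return created ? shardDir : null;
  }
}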


Does one map instance only handle one input path at the same time?

2011-01-21 Thread lei liu
There are two input directories, /user/test1/ and /user/test2/, and I want to
join their content. In order to join the two directories, I need to identify
which directory the content handled by a mapper comes from, so I use the code
below in the mapper:

private int tag = -1;

@Override
public void configure(JobConf conf) {
  try {
    this.conf = conf;
    // example: conf.set("paths.to.alias", "0=/user/test1/,1=/user/test2/")
    String pathsToAliasStr = conf.get("paths.to.alias");
    String[] pathsToAlias = pathsToAliasStr.split(",");

    Path fpath = new Path((new Path(conf.get("map.input.file"))).toUri().getPath());
    String path = fpath.toUri().toString();

    for (int i = 0; i < pathsToAlias.length; i++) {
      String[] pathToAlias = pathsToAlias[i].split("=");
      if (path.startsWith(pathToAlias[1])) {
        // identifies which directory's content the current map instance handles
        tag = Integer.valueOf(pathToAlias[0].trim());
      }
    }
  } catch (Throwable e) {
    e.printStackTrace();
    throw new RuntimeException(e);
  }
}

So when the map method runs, all of the content handled by that mapper is
identified as coming from the same directory.

I want to know whether one mapper instance only handles the content of one
directory at the same time.


Thanks

LiuLei
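
With FileInputFormat, one map task processes a single split and a split does
not span files, so map.input.file stays constant for the lifetime of a task
(configure() runs once per task even when task JVMs are reused). An alternative
to parsing map.input.file is to give each directory its own mapper class via
MultipleInputs; the sketch below assumes the old org.apache.hadoop.mapred API,
and Test1Mapper/Test2Mapper are illustrative classes that only demonstrate the
per-directory tagging.

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.lib.MultipleInputs;

public class JoinJobSetup {

  // Records read from /user/test1/ are tagged "0"; a real join mapper would
  // emit the join column as the key instead of the whole line.
  public static class Test1Mapper extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, Text> {
    public void map(LongWritable key, Text value,
        OutputCollector<Text, Text> out, Reporter reporter) throws IOException {
      out.collect(value, new Text("0"));
    }
  }

  // Records read from /user/test2/ are tagged "1".
  public static class Test2Mapper extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, Text> {
    public void map(LongWritable key, Text value,
        OutputCollector<Text, Text> out, Reporter reporter) throws IOException {
      out.collect(value, new Text("1"));
    }
  }

  public static void configureInputs(JobConf conf) {
    MultipleInputs.addInputPath(conf, new Path("/user/test1/"),
        TextInputFormat.class, Test1Mapper.class);
    MultipleInputs.addInputPath(conf, new Path("/user/test2/"),
        TextInputFormat.class, Test2Mapper.class);
  }
}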


how does hadoop handle the counter of the failed task and speculative task

2010-12-25 Thread lei liu
I define a counter to count the bad records; the code below is in the map
task:
reporter.incrCounter("bad", "records", 1);

When the job is completed, I print the result using the code below:
long total = counters.findCounter("bad", "records").getCounter();


But I have two questions about the counter:
1. If the map task is retried 4 times and only the last attempt is successful,
I think the counters of the other three attempts should not be included in the
final result; is that right?
2. If there are speculative tasks, I think the counters of the speculative
tasks should not be included in the final result; is that right?


Thanks,

LiuLei
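
In the classic MapReduce counter model, the final job counters are aggregated
only from the successful attempt of each task; counters from failed attempts
and from killed speculative attempts are dropped, which should match the
behaviour you expect in both questions. A minimal sketch of the same pattern
with an enum counter (the class and enum names are illustrative), which avoids
typos in the group and counter strings:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class BadRecordMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  // Enum counters are grouped under the enum's class name in the job counters.
  public enum BadRecords { PARSE_ERROR }

  public void map(LongWritable key, Text value,
      OutputCollector<Text, Text> out, Reporter reporter) throws IOException {
    try {
      // ... parse the record and emit it ...
      out.collect(value, value);
    } catch (RuntimeException e) {
      // Same idea as reporter.incrCounter("bad", "records", 1).
      reporter.incrCounter(BadRecords.PARSE_ERROR, 1);
    }
  }
}

After the job completes, the aggregated value can be read with something like
runningJob.getCounters().getCounter(BadRecords.PARSE_ERROR).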


Virtual Columns error

2010-09-20 Thread lei liu
I use the hive-0.6 version and execute the 'select INPUT_FILE_NAME,
BLOCK_OFFSET_INSIDE_FILE from person1' statement, and hive-0.6 throws the
error below:
FAILED: Error in semantic analysis: line 1:7 Invalid Table Alias or Column
Reference INPUT_FILE_NAME error.

Does hive-0.6 not support virtual columns?


how to create index on one table

2010-09-20 Thread lei liu
I use hive-0.6 and I want to create an index on one table. How can I do it?


Re: how to export create statement for one table

2010-09-19 Thread lei liu
I know the describe statement, but it does not display the FIELDS TERMINATED
and LINES TERMINATED clauses; it only displays the column names and column
types.

2010/9/19 Ted Yu yuzhih...@gmail.com

 See bottom of http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL


 On Sat, Sep 18, 2010 at 7:13 PM, lei liu liulei...@gmail.com wrote:

 I use below statement to create one table:

 CREATE TABLE page_view(viewTime INT, userid BIGINT,
  page_url STRING, referrer_url STRING,
  ip STRING COMMENT 'IP Address of the User')
  COMMENT 'This is the page view table'
  PARTITIONED BY(dt STRING, country STRING)
  ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\001'
 STORED AS SEQUENCEFILE;


 Now I want to export the DDL for page_view table, how can I do it ?
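
As far as I know, Hive releases of this era have no statement that prints the
original CREATE TABLE text, but DESCRIBE EXTENDED dumps the detailed table
metadata, including the SerDe parameters that hold the field and line
delimiters, the partition keys, and the storage format. A hedged JDBC sketch
(the connection URL is illustrative and assumes a standalone HiveServer):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DescribeExtended {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
    Connection con =
        DriverManager.getConnection("jdbc:hive://localhost:10000/default", "", "");
    Statement stmt = con.createStatement();
    ResultSet rs = stmt.executeQuery("DESCRIBE EXTENDED page_view");
    while (rs.next()) {
      // The final "Detailed Table Information" row carries the serde
      // parameters (for example field.delim) and the partition keys.
      System.out.println(rs.getString(1));
    }
    con.close();
  }
}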





add partition

2010-09-19 Thread lei liu
I use the statements below to create one table and add one partition:
create external table test(userid bigint,name string, age int) partitioned
by(pt string);
alter table test add partition(pt='01');


Now there is one file in HDFS at /user/hive/warehouse/user. I use the load
statement to load the file into the partition: load data inpath
'/user/hive/warehouse/user' into table test partition(pt='01'). I find that
the file path is changed from /user/hive/warehouse/user to
/user/hive/warehouse/test/pt=01. I want the file path to stay unchanged; how
can I do it?
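
One way to avoid the move, sketched below and worth verifying on your Hive
version: since test is an external table, skip LOAD DATA and instead point the
partition directly at the existing directory with a LOCATION clause, so the
file stays at /user/hive/warehouse/user. The JDBC connection details are
illustrative.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class AddPartitionInPlace {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
    Connection con =
        DriverManager.getConnection("jdbc:hive://localhost:10000/default", "", "");
    Statement stmt = con.createStatement();
    // Registers the partition without moving any files, unlike LOAD DATA
    // INPATH, which relocates them under the table directory.
    stmt.execute("ALTER TABLE test ADD PARTITION (pt='01') "
        + "LOCATION '/user/hive/warehouse/user'");
    con.close();
  }
}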


how to connect to the metastore server with the hive JDBC client

2010-09-18 Thread lei liu
I use the ./hive --service metastore command to start the metastore server.
How can I connect to the metastore server with the hive JDBC client?
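
For what it is worth, the Hive JDBC driver talks to a HiveServer started with
./hive --service hiveserver, not to the metastore service itself; the Thrift
metastore started with ./hive --service metastore is normally reached through
HiveMetaStoreClient. A hedged sketch follows; the host, port, and table names
are illustrative, and the exact constructor is worth checking against your
Hive version.

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Table;

public class MetastoreClientExample {
  public static void main(String[] args) throws Exception {
    HiveConf conf = new HiveConf(MetastoreClientExample.class);
    // Point the client at the remote Thrift metastore instead of an embedded one.
    conf.set("hive.metastore.uris", "thrift://127.0.0.1:9083");
    conf.setBoolean("hive.metastore.local", false);

    HiveMetaStoreClient client = new HiveMetaStoreClient(conf);
    Table t = client.getTable("default", "page_view");
    System.out.println(t.getSd().getLocation());
    client.close();
  }
}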


how to export the DDL statement for one table

2010-09-18 Thread lei liu
I use below statement to create one table:

CREATE TABLE page_view(viewTime INT, userid BIGINT,
 page_url STRING, referrer_url STRING,
 ip STRING COMMENT 'IP Address of the User')
 COMMENT 'This is the page view table'
 PARTITIONED BY(dt STRING, country STRING)
 ROW FORMAT DELIMITED
   FIELDS TERMINATED BY '\001'
STORED AS SEQUENCEFILE;


Now I want to export the DDL for page_view; how can I do it?


how to export create statement for one table

2010-09-18 Thread lei liu
I use below statement to create one table:

CREATE TABLE page_view(viewTime INT, userid BIGINT,
 page_url STRING, referrer_url STRING,
 ip STRING COMMENT 'IP Address of the User')
 COMMENT 'This is the page view table'
 PARTITIONED BY(dt STRING, country STRING)
 ROW FORMAT DELIMITED
   FIELDS TERMINATED BY '\001'
STORED AS SEQUENCEFILE;


Now I want to export the DDL for page_view table, how can I do it ?


GroupByOperator class confuse , it will result in out of memeory

2010-09-02 Thread lei liu
I find that GroupByOperator caches the aggregation results of different keys.
Please look at the code below:
AggregationBuffer[] aggs = null;
boolean newEntryForHashAggr = false;

keyProber.hashcode = newKeys.hashCode();
// use this to probe the hashmap
keyProber.keys = newKeys;

// hash-based aggregations
aggs = hashAggregations.get(keyProber);
ArrayList<Object> newDefaultKeys = null;
if (aggs == null) {
  newDefaultKeys = deepCopyElements(keyObjects, keyObjectInspectors,
  ObjectInspectorCopyOption.WRITABLE);
  KeyWrapper newKeyProber = new KeyWrapper(keyProber.hashcode,
  newDefaultKeys, true);
  aggs = newAggregations();
  hashAggregations.put(newKeyProber, aggs);
  newEntryForHashAggr = true;
  numRowsHashTbl++; // new entry in the hash table
}



When there are a large number of different keys and the cached value for each
key is about 10 KB, that can occupy 10 GB of memory and the JVM will run out
of memory. Could anybody tell me how to handle this problem?



Thanks,

LiuLei
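
When the map-side hash table cannot hold one entry per distinct key, the usual
knobs in this Hive era are the map-side aggregation settings. The parameter
names below are standard Hive settings, but whether your build honours both of
them is worth checking; the JDBC connection details and the query are
illustrative.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class GroupByMemorySettings {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
    Connection con =
        DriverManager.getConnection("jdbc:hive://localhost:10000/default", "", "");
    Statement stmt = con.createStatement();

    // Disable map-side hash aggregation: GroupByOperator then streams rows to
    // the reducers instead of caching one AggregationBuffer per distinct key.
    stmt.execute("set hive.map.aggr=false");

    // Alternatively, keep map-side aggregation but lower the share of the heap
    // the hash table may use before partial results are flushed:
    // stmt.execute("set hive.map.aggr.hash.percentmemory=0.3");

    stmt.execute("SELECT key, count(1) FROM src GROUP BY key");
    con.close();
  }
}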


hive-0.6 don't connection mysql in metastore

2010-08-29 Thread lei liu
I use hive-0.6 and use MySQL as the metastore, but hive does not connect to
MySQL.

2010-08-30 13:28:24,982 ERROR [main] util.Log4JLogger(125): Failed
initialising database.
Invalid URL: jdbc:mysql://127.0.0.1:3306/hive6?createDatabaseIfNotExist=true
org.datanucleus.exceptions.NucleusDataStoreException: Invalid URL:
jdbc:mysql://127.0.0.1:3306/hive6?createDatabaseIfNotExist=true
at
org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:536)
at
org.datanucleus.store.rdbms.RDBMSStoreManager.init(RDBMSStoreManager.java:290)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown
Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at
org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:588)
at
org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:300)
at
org.datanucleus.ObjectManagerFactoryImpl.initialiseStoreManager(ObjectManagerFactoryImpl.java:161)
at
org.datanucleus.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:583)
at
org.datanucleus.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:286)
at
org.datanucleus.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:182)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at javax.jdo.JDOHelper$16.run(JDOHelper.java:1958)
at java.security.AccessController.doPrivileged(Native Method)
at javax.jdo.JDOHelper.invoke(JDOHelper.java:1953)
at
javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1159)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:803)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:698)
at
org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:191)
at
org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:208)
at
org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:153)
at
org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:128)
at
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:54)
at
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:83)
at
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:276)
at
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.executeWithRetry(HiveMetaStore.java:228)
at
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:374)
at
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:166)
at
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:125)
at
org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.init(HiveServer.java:79)
at
org.apache.hadoop.hive.jdbc.HiveConnection.init(HiveConnection.java:85)
at org.apache.hadoop.hive.jdbc.HiveDriver.connect(HiveDriver.java:110)
at java.sql.DriverManager.getConnection(Unknown Source)
at java.sql.DriverManager.getConnection(Unknown Source)
at my.examples.multithreadquery.SimpleSql.main(SimpleSql.java:21)
Caused by: java.sql.SQLException: Invalid URL: jdbc:mysql://
127.0.0.1:3306/hive6?createDatabaseIfNotExist=true
at
org.apache.hadoop.hive.jdbc.HiveConnection.init(HiveConnection.java:76)
at org.apache.hadoop.hive.jdbc.HiveDriver.connect(HiveDriver.java:110)
at java.sql.DriverManager.getConnection(Unknown Source)
at java.sql.DriverManager.getConnection(Unknown Source)
at
org.apache.commons.dbcp.DriverManagerConnectionFactory.createConnection(DriverManagerConnectionFactory.java:75)
at
org.apache.commons.dbcp.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:582)
at
org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1148)
at
org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
at
org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:521)
... 38 more



I find that hive-0.6 tries to connect to MySQL with
org.apache.hadoop.hive.jdbc.HiveDriver. I think that is wrong; it should use
com.mysql.jdbc.Driver to connect to MySQL.
Below is my configuration:

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://127.0.0.1:3306/hive6?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC

how does hive add hive_exec.jar to hadoop

2010-08-24 Thread lei liu
When hadoop runs a job that is submitted by hive, hadoop needs the
hive_exec.jar. How does hive add hive_exec.jar to hadoop?
Please tell me where that code is in hive.

Thanks,

LiuLei


Re: java.sql.SQLException: org.apache.thrift.transport.TTransportException: Cannot read. Remote side has closed. Tried to read 1 bytes, but only got 0 bytes.

2010-08-23 Thread lei liu
Yes, you are right, I do that. But after the hive server has been running for
several days, when a client connects to the hive server, the client receives
the exception.

2010/8/23 Adarsh Sharma adarsh.sha...@orkash.com

 For Running Hive in Server Mode ..
 First U have to start service of hiveserver ::


 *$bin/hive --service hiveserver
 *

 and then run the code


 lei liu wrote:

 Hello everyone,


 I use JDBC to connection the hive server, sometime I receive below
 exception:
 java.sql.SQLException: org.apache.thrift.transport.TTransportException:
 Cannot read. Remote side has closed. Tried to read 1 bytes, but only got 0
 bytes.

 Please tell me the eason.


 Thanks


 LiuLei





Re: Re: how to support chinese in hive

2010-08-16 Thread lei liu
Hi shangan,

You need to set the Linux encoding to UTF-8.

2010/8/16 shangan shan...@corp.kaixin001.com

  the fact is that even I hava data in UTF-8 using simplified Chinese, then
 doing a select * it will return an unreadable result. Does that mean hive
 can only support ascii character ?

 2010-08-16
 --
  shangan
 --
 *From:* Jeff Hammerbacher
 *Sent:* 2010-08-15  07:19:27
 *To:* hive-user
 *Cc:*
 *Subject:* Re: how to support chinese in hive
  Hey shangan,

 There's a ticket open to make Hive work with non-UTF-8 codecs at
 https://issues.apache.org/jira/browse/hive-1505. Perhaps you could add
 more about your needs there?

 Later,
 Jeff

 On Fri, Aug 13, 2010 at 4:02 AM, shangan shan...@corp.kaixin001.comwrote:

  hi,all
 Could anyone tell me how to configurate hive in order to support Chinese
 characters ?

 And when using hwi,how to configure directory of the result file, by
 default now it is the 'conf' directory under my installation path.

 2010-08-13
 --
 shangan





Re: what is difference hive local model and standalone model.

2010-08-14 Thread lei liu
You can look at the http://wiki.apache.org/hadoop/Hive/HiveClient page. For
local mode the URI is jdbc:hive://, and for standalone mode the URI is
jdbc:hive://host:port/dbname. When we use local mode, my application and the
hive server run in the same JVM, so we don't need to maintain a separate hive
server. I think that is the advantage of local mode. I want to know what the
disadvantage is when we use local mode.

2010/8/14 Joydeep Sen Sarma jssa...@facebook.com

  Lei – not sure I understand the question. I tried to document the
 relationship between hive, MR and local-mode at
 http://wiki.apache.org/hadoop/Hive/GettingStarted#Hive.2C_Map-Reduce_and_Local-Moderecently.
  perhaps you have already read it.



 Regarding whether local mode can be run on windows or not – I really don’t
 know. First of all – hadoop has to be runnable in local mode on windows
 (using cygwin I presume?). then one has to test hive against this – one
 would think it should work if hadoop does – but we would have to verify.



 (ie. yes – it should be possible in theory – but in practice – there are
 probably bugs that need to get sorted out for this to happen).


  --

 *From:* lei liu [mailto:liulei...@gmail.com]
 *Sent:* Friday, August 13, 2010 9:10 AM
 *To:* hive-user@hadoop.apache.org
 *Subject:* what is difference hive local model and standalone model.



 what is difference hive local model and standalone model. Can the hive
 local model be ran in windows?



what is difference hive local model and standalone model.

2010-08-13 Thread lei liu
What is the difference between hive local mode and standalone mode? Can hive
local mode be run on Windows?


Re: How to use JDBC client embedded mode

2010-08-11 Thread lei liu
Thank you for your reply. I had looked at the
http://wiki.apache.org/hadoop/Hive/HiveClient#JDBC page before. What does the
embedded mode mentioned on that page mean? Is that hive embedded mode? I mean
that I don't need to start hive separately: the hive server can be embedded in
my application, so my application does not need a network connection to access
the hive server.
2010/8/11 Bill Graham billgra...@gmail.com

 The code and start script shown in this section of the wiki shows how to
 run hive in embedded mode.

 http://wiki.apache.org/hadoop/Hive/HiveClient#JDBC

 Compile the code after changing the JDBC URI to 'jdbc:hive://' and run the
 example script. This will run the code, which will start Hive in embedded
 mode, create a table, do some operations on it, and then drop it.



 On Tue, Aug 10, 2010 at 8:05 AM, lei liu liulei...@gmail.com wrote:

 Can anybody answer the question?

 Thanks,

 LiuLei

 2010/8/10 lei liu liulei...@gmail.com

 I look see below content in 
 http://wiki.apache.org/hadoop/Hive/HiveClientpage: For embedded mode, uri is 
 just jdbc:hive://.   How can I use JDBC
 client embedded mode? Could anybody give me an example?
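
A minimal sketch of embedded mode as described on the HiveClient wiki page
quoted above: with the URI jdbc:hive:// and no host or port, the Hive runtime
is loaded inside the application's own JVM, so no separate hive server process
has to be started or connected to. The application's classpath must contain
the Hive and Hadoop jars plus the hive-site.xml configuration; the table name
below is illustrative.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class EmbeddedHiveExample {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
    // Embedded mode: no host or port in the URI.
    Connection con = DriverManager.getConnection("jdbc:hive://", "", "");
    Statement stmt = con.createStatement();

    stmt.execute("CREATE TABLE embedded_test (key INT, value STRING)");
    ResultSet rs = stmt.executeQuery("SHOW TABLES");
    while (rs.next()) {
      System.out.println(rs.getString(1));
    }
    stmt.execute("DROP TABLE embedded_test");
    con.close();
  }
}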






Re: How to merge small files

2010-08-10 Thread lei liu
Thank you for your reply.

Could you tell me why the query is slower if the two parameters are true, and
how much slower it is?

2010/8/10 Namit Jain nj...@facebook.com

 Yes, it will try to run another map-reduce job to merge the files
 
 From: lei liu [liulei...@gmail.com]
 Sent: Monday, August 09, 2010 8:57 AM
 To: hive-user@hadoop.apache.org
 Subject: Re: How to merge small files

 Could you tell me whether the query is slower if I two parameters both are
 true?

 2010/8/9 Namit Jain nj...@facebook.commailto:nj...@facebook.com
 That's right

 
 From: lei liu [liulei...@gmail.commailto:liulei...@gmail.com]
 Sent: Sunday, August 08, 2010 7:18 PM
 To: hive-user@hadoop.apache.orgmailto:hive-user@hadoop.apache.org
 Subject: Re: How to merge small files

 Thank you for your reply.

 Your mean is I will execute below statement:

 statement.execute(set hive.merge.mapfiles=true);
 statement.execute(set hive.merge.mapredfiles=true);

 The two parementers are both true, right?

 2010/8/6 Namit Jain nj...@facebook.commailto:nj...@facebook.commailto:
 nj...@facebook.commailto:nj...@facebook.com
   HIVEMERGEMAPFILES(hive.merge.mapfiles, true),
  HIVEMERGEMAPREDFILES(hive.merge.mapredfiles, false),


 Set the above parameters to true before your query.



 
 From: lei liu [liulei...@gmail.commailto:liulei...@gmail.commailto:
 liulei...@gmail.commailto:liulei...@gmail.com]
 Sent: Thursday, August 05, 2010 8:47 PM
 To: hive-user@hadoop.apache.orgmailto:hive-user@hadoop.apache.org
 mailto:hive-user@hadoop.apache.orgmailto:hive-user@hadoop.apache.org
 Subject: How to merge small files

 When I run below sql:  INSERT OVERWRITE TABLE tablename1 select_statement1
 FROM from_statement, there are many files which size is zero are stored to
 hadoop,

 How can I merge these small files?

 Thanks,



 LiuLei






how to call the UDF/UDAF in hive

2010-08-09 Thread lei liu
Hello everyone,

Could anybody tell me how to call a UDF/UDAF in hive?
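
A minimal UDF sketch (the class and function names are illustrative), assuming
the classic org.apache.hadoop.hive.ql.exec.UDF base class with an evaluate
method:

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class ToUpperUDF extends UDF {
  // Hive calls evaluate() once per row; returning null for null input keeps
  // the usual SQL null semantics.
  public Text evaluate(Text input) {
    if (input == null) {
      return null;
    }
    return new Text(input.toString().toUpperCase());
  }
}

After packaging it into a jar, it would typically be registered and invoked
with statements along the lines of: add jar /path/to/my-udf.jar; create
temporary function my_upper as 'ToUpperUDF'; select my_upper(name) from test;.
Built-in UDFs and UDAFs, such as concat or count, are simply called by name
inside a query.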


Re: How to merge small files

2010-08-09 Thread lei liu
Could you tell me whether the query is slower if both parameters are set to
true?

2010/8/9 Namit Jain nj...@facebook.com

 That's right

 
 From: lei liu [liulei...@gmail.com]
 Sent: Sunday, August 08, 2010 7:18 PM
 To: hive-user@hadoop.apache.org
 Subject: Re: How to merge small files

 Thank you for your reply.

 Your mean is I will execute below statement:

 statement.execute(set hive.merge.mapfiles=true);
 statement.execute(set hive.merge.mapredfiles=true);

 The two parementers are both true, right?

 2010/8/6 Namit Jain nj...@facebook.commailto:nj...@facebook.com
   HIVEMERGEMAPFILES(hive.merge.mapfiles, true),
   HIVEMERGEMAPREDFILES(hive.merge.mapredfiles, false),


 Set the above parameters to true before your query.



 
 From: lei liu [liulei...@gmail.commailto:liulei...@gmail.com]
 Sent: Thursday, August 05, 2010 8:47 PM
 To: hive-user@hadoop.apache.orgmailto:hive-user@hadoop.apache.org
  Subject: How to merge small files

 When I run below sql:  INSERT OVERWRITE TABLE tablename1 select_statement1
 FROM from_statement, there are many files which size is zero are stored to
 hadoop,

 How can I merge these small files?

 Thanks,



 LiuLei





How to use JDBC client embedded mode

2010-08-09 Thread lei liu
I see the content below on the
http://wiki.apache.org/hadoop/Hive/HiveClient page: "For embedded mode, uri is
just jdbc:hive://". How can I use the JDBC client in embedded mode? Could
anybody give me an example?


Re: How to merge small files

2010-08-08 Thread lei liu
Thank you for your reply.

You mean that I should execute the statements below:

statement.execute("set hive.merge.mapfiles=true");
statement.execute("set hive.merge.mapredfiles=true");

The two parameters are both true, right?

2010/8/6 Namit Jain nj...@facebook.com

HIVEMERGEMAPFILES(hive.merge.mapfiles, true),
HIVEMERGEMAPREDFILES(hive.merge.mapredfiles, false),


 Set the above parameters to true before your query.



 
 From: lei liu [liulei...@gmail.com]
 Sent: Thursday, August 05, 2010 8:47 PM
 To: hive-user@hadoop.apache.org
 Subject: How to merge small files

 When I run below sql:  INSERT OVERWRITE TABLE tablename1 select_statement1
 FROM from_statement, there are many files which size is zero are stored to
 hadoop,

 How can I merge these small files?

 Thanks,



 LiuLei




JDBC embedded mode

2010-08-08 Thread lei liu
How can I use the embedded mode of JDBC? Could anybody give me an example?


how to debug code in org.apache.hadoop.hive.ql.exec package

2010-08-06 Thread lei liu
How can I debug the code in the org.apache.hadoop.hive.ql.exec package?


Re: why is slow when use OR clause instead of IN clause

2010-08-05 Thread lei liu
When there are one thousand OR clauses, hive throws the exception below:
Total MapReduce jobs = 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.lang.StackOverflowError
at java.beans.Statement.init(Statement.java:60)
at java.beans.Expression.init(Expression.java:47)
at java.beans.Expression.init(Expression.java:65)
at
java.beans.PrimitivePersistenceDelegate.instantiate(MetaData.java:79)
at
java.beans.PersistenceDelegate.writeObject(PersistenceDelegate.java:97)
at java.beans.Encoder.writeObject(Encoder.java:54)
at java.beans.XMLEncoder.writeObject(XMLEncoder.java:257)
at java.beans.Encoder.writeObject1(Encoder.java:206)
at java.beans.Encoder.cloneStatement(Encoder.java:219)
at java.beans.Encoder.writeExpression(Encoder.java:278)
at java.beans.XMLEncoder.writeExpression(XMLEncoder.java:372)
at
java.beans.PersistenceDelegate.writeObject(PersistenceDelegate.java:97)
at java.beans.Encoder.writeObject(Encoder.java:54)
at java.beans.XMLEncoder.writeObject(XMLEncoder.java:257)
at java.beans.Encoder.writeObject1(Encoder.java:206)
at java.beans.Encoder.cloneStatement(Encoder.java:219)
at java.beans.Encoder.writeExpression(Encoder.java:278)
at java.beans.XMLEncoder.writeExpression(XMLEncoder.java:372)
at
java.beans.PersistenceDelegate.writeObject(PersistenceDelegate.java:97)
at java.beans.Encoder.writeObject(Encoder.java:54)
at java.beans.XMLEncoder.writeObject(XMLEncoder.java:257)
at java.beans.Encoder.writeExpression(Encoder.java:279)
at java.beans.XMLEncoder.writeExpression(XMLEncoder.java:372)
at
java.beans.DefaultPersistenceDelegate.doProperty(DefaultPersistenceDelegate.java:212)
at
java.beans.DefaultPersistenceDelegate.initBean(DefaultPersistenceDelegate.java:247)
at
java.beans.DefaultPersistenceDelegate.initialize(DefaultPersistenceDelegate.java:395)
at
java.beans.PersistenceDelegate.writeObject(PersistenceDelegate.java:100).



When there are two hundred OR clauses, it is very, very slow.

Now I use the 0.4.1 version; if I upgrade to the 0.6 version, what do I need
to do?

In addition, when will the 0.6 version be released?

Thanks,


LiuLei

2010/8/5 Ning Zhang nzh...@facebook.com

 I tested (1000 disjunctions) and it was extremely slow but no OOM. The
 issue seems to be the fact that we serialize the plan by writing to HDFS
 file directly. We probably should cache it locally and then write it to
 HDFS.

 On Aug 4, 2010, at 10:23 AM, Edward Capriolo wrote:

  On Wed, Aug 4, 2010 at 1:15 PM, Ning Zhang nzh...@facebook.com wrote:
  Currently an expression tree (series of ORs in this case) is not
 collapsed to one operator or any other optimizations. It would be great to
 have this optimization rule to convert an OR operator tree to one IN
 operator. Would you be able to file a JIRA and contribute a patch?
 
  On Aug 4, 2010, at 7:46 AM, Mark Tozzi wrote:
 
  I haven't looked at the code, but I assume the query parser would sort
  the 'in' terms and then do a binary search lookup into them for each
  row, while the 'or' terms don't have that kind of obvious relationship
  and are probably tested in sequence.  This would give the in O(log N)
  performance compared to a chain of or's having O(N) performance, per
  row queried.  For large N, that could add up.  That being said, I'm
  just speculating here.  The query parser may be smart enough to
  optimize the related or's in the same way, or it may not optimize that
  at all.  If I get a chance, I'll try to dig around and see what it's
  doing, as I have also had a lot of large 'in' queries and could use
  every drop of performance I can get.
 
  --Mark
 
  On Wed, Aug 4, 2010 at 9:47 AM, Edward Capriolo edlinuxg...@gmail.com
 wrote:
  On Wed, Aug 4, 2010 at 6:10 AM, lei liu liulei...@gmail.com wrote:
  Because my company reuire we use 0.4.1 version, the version don't
 support IN
  clause. I want to  use the OR clause(example:where id=1 or id=2 or
 id=3) to
  implement the IN clause(example: id in(1,2,3) ).  I know it will be
 slower
  especially when the list after in is very long.  Could anybody can
 tell me
  why is slow when use OR clause to implement In clause?
 
 
  Thanks,
 
 
  LiuLei
 
 
  I can not imagine the performance difference between 'or' or 'in'
  would be that great but I never benchmarked it. The big looming
  problems is that if you string enough 'or' together (say 8000) the
  query parser which uses java beans serialization will OOM.
 
  Edward
 
 
 
 
  For reference I did this as a test case
  SELECT * FROM src where
  key=0 OR key=0 OR key=0 OR  key=0 OR key=0 OR key=0 OR key=0 OR key=0
  OR key=0 OR key=0 OR key=0 OR
  key=0 OR key=0 OR key=0 OR  key=0 OR key=0 OR key=0 OR key=0 OR key=0
  OR key=0 OR key=0 OR key=0 OR
  ...(100 more of these)
 
  No OOM but I gave up after the test case did not go

how to debug hive and hadoop

2010-08-05 Thread lei liu
I have used a 'Remote Java Application' configuration in Eclipse to debug hive
code; now I want to debug hive and hadoop together. How can I do it?



Thanks,

LiuLei


How to merge small files

2010-08-05 Thread lei liu
When I run the SQL below: INSERT OVERWRITE TABLE tablename1
select_statement1 FROM from_statement, many files whose size is zero are
stored in hadoop.

How can I merge these small files?

Thanks,


LiuLei


why is slow when use OR clause instead of IN clause

2010-08-04 Thread lei liu
Because my company requires us to use the 0.4.1 version, which does not
support the IN clause, I want to use the OR clause (example: where id=1 or
id=2 or id=3) to implement the IN clause (example: id in (1,2,3)). I know it
will be slower, especially when the list after IN is very long. Could anybody
tell me why it is slow to use the OR clause to implement the IN clause?


Thanks,


LiuLei


Re: why is slow when use OR clause instead of IN clause

2010-08-04 Thread lei liu
Hello Edward Capriolo,

Thank you for your reply. Are you sure that if you string enough 'or' clauses
together (say 8000), the query parser, which uses java beans serialization,
will OOM? How much memory did you assign to hive?

2010/8/4 Edward Capriolo edlinuxg...@gmail.com

 On Wed, Aug 4, 2010 at 6:10 AM, lei liu liulei...@gmail.com wrote:
  Because my company reuire we use 0.4.1 version, the version don't support
 IN
  clause. I want to  use the OR clause(example:where id=1 or id=2 or id=3)
 to
  implement the IN clause(example: id in(1,2,3) ).  I know it will be
 slower
  especially when the list after in is very long.  Could anybody can tell
 me
  why is slow when use OR clause to implement In clause?
 
 
  Thanks,
 
 
  LiuLei
 

 I can not imagine the performance difference between 'or' or 'in'
 would be that great but I never benchmarked it. The big looming
 problems is that if you string enough 'or' together (say 8000) the
 query parser which uses java beans serialization will OOM.

 Edward



Re: why is slow when use OR clause instead of IN clause

2010-08-04 Thread lei liu
Now I assign 100 MB of memory to hive; how many 'OR' clauses do you think that
can support?

2010/8/5 Edward Capriolo edlinuxg...@gmail.com

 On Wed, Aug 4, 2010 at 12:15 PM, lei liu liulei...@gmail.com wrote:
  Hello Edward Capriolo,
 
  Thank you for your reply. Are you sure that if you string enough 'or'
  together (say 8000) the query parser which uses java beans serialization
  will OOM? How many memory you assign to hive?
 
  2010/8/4 Edward Capriolo edlinuxg...@gmail.com
 
  On Wed, Aug 4, 2010 at 6:10 AM, lei liu liulei...@gmail.com wrote:
   Because my company reuire we use 0.4.1 version, the version don't
   support IN
   clause. I want to  use the OR clause(example:where id=1 or id=2 or
 id=3)
   to
   implement the IN clause(example: id in(1,2,3) ).  I know it will be
   slower
   especially when the list after in is very long.  Could anybody can
   tell me
   why is slow when use OR clause to implement In clause?
  
  
   Thanks,
  
  
   LiuLei
  
 
  I can not imagine the performance difference between 'or' or 'in'
  would be that great but I never benchmarked it. The big looming
  problems is that if you string enough 'or' together (say 8000) the
  query parser which uses java beans serialization will OOM.
 
  Edward
 
 

 That is exactly what I am saying. I tested with 4GB and 8GB. I am not
 exactly sure how many OR's you can get away with for your memory size,
 but some upper limit exists currently. Most people never hit it. (I
 did because my middle name is edge case )
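
A hedged workaround for the 0.4.1 constraint discussed in this thread:
materialize the wanted ids in a small table and JOIN against it instead of
building a huge OR chain, which keeps the operator plan tiny and sidesteps the
plan-serialization blow-up described above. The table, column, and path names
are illustrative, and the result matches IN semantics only if the id table
contains no duplicate ids.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class InListAsJoin {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
    Connection con =
        DriverManager.getConnection("jdbc:hive://localhost:10000/default", "", "");
    Statement stmt = con.createStatement();

    // One row per wanted id, loaded from a small file prepared by the client.
    stmt.execute("CREATE TABLE id_filter (id BIGINT)");
    stmt.execute("LOAD DATA INPATH '/tmp/wanted_ids' INTO TABLE id_filter");

    // Plays the role of "WHERE t.id IN (...)" without a long OR chain.
    ResultSet rs = stmt.executeQuery(
        "SELECT t.* FROM person t JOIN id_filter f ON (t.id = f.id)");
    while (rs.next()) {
      System.out.println(rs.getString(1));
    }
    con.close();
  }
}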