CDH4.5 HiveServer2 InterruptedException
Hi, I'm using CDH4.5 and its built-in HiveServer2. Sometimes it throws the following exception, and the job cannot be submitted:

2014-08-18 09:16:33,346 INFO org.apache.hadoop.hive.ql.exec.ExecDriver: Making Temp Directory: hdfs://nameservice1/tmp/hive-hive-hadoop/hive_2014-08-18_09-16-32_093_3323860800312087449-967/-ext-10001
2014-08-18 09:16:33,350 WARN org.apache.hadoop.ipc.Client: interrupted waiting to send params to server
java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1279)
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
        at java.util.concurrent.FutureTask.get(FutureTask.java:83)
        at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:924)
        at org.apache.hadoop.ipc.Client.call(Client.java:1211)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        at $Proxy14.mkdirs(Unknown Source)

I googled around and this bug comes up: https://issues.apache.org/jira/browse/HADOOP-6762

Is it related? Or is there something else I can do to prevent this? Thanks.
why does webhcat_server listen on port 8080
hi everyone:

I installed Hive 0.13 and I found this configuration in HIVE_HOME/hcatalog/etc/webhcat/webhcat-default.xml:

    templeton.port
    50111

but when I start webhcat_server, it listens on port 8080.

no...@sina.cn
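One thing worth checking (an assumption, since the poster's site configuration is not shown): webhcat-default.xml only supplies defaults, and any webhcat-site.xml on WebHCat's config path overrides it. A minimal site-file sketch pinning the port explicitly to 50111 would look like this:

```xml
<?xml version="1.0"?>
<!-- webhcat-site.xml: site-specific overrides for webhcat-default.xml.
     This is an illustrative sketch; adjust the location and value to
     your installation. -->
<configuration>
  <property>
    <name>templeton.port</name>
    <value>50111</value>
    <description>HTTP port for the WebHCat (Templeton) server.</description>
  </property>
</configuration>
```

If the server still binds to 8080 after this, it may be reading a different config directory than expected, so verifying which webhcat-site.xml is actually on the classpath would be the next step.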
Re: New lines causing new rows
Hi, Charles,

What's the storage format for the raw data source? What's the definition of your view?

On 18 August 2014 04:20, Charles Robertson wrote:
> Hi all,
>
> I am loading some data into a Hive table, and one of the fields contains
> text which I believe contains new line characters. I have a view which
> reads data from this table, and the new line characters appear to be
> starting new rows.
>
> Doing 'select * from [mytable] limit 10;' in the hive console returns ten
> rows, on more than ten lines. Doing 'select * from [view] limit 10' in the
> console returns ten lines but fewer than ten rows.
>
> I've tried using the 'translate' function in the view definition to
> replace \r with a space character, but that seems to have just broken
> everything (it complains of a missing EOF).
>
> Can anyone suggest a better way to remove the line breaks and/or prevent
> the view treating them as new rows?
>
> Thanks,
> Charles

--
André Araújo
Big Data Consultant/Solutions Architect
The Pythian Group - Australia - www.pythian.com
Office (calls from within Australia): 1300 366 021 x1270
Office (international): +61 2 8016 7000 x270 *OR* +1 613 565 8696 x1270
Mobile: +61 410 323 559
Fax: +61 2 9805 0544
IM: pythianaraujo @ AIM/MSN/Y! or ara...@pythian.com @ GTalk

"Success is not about standing at the top, it's the steps you leave behind." - Iker Pou (rock climber)
New lines causing new rows
Hi all,

I am loading some data into a Hive table, and one of the fields contains text which I believe contains new line characters. I have a view which reads data from this table, and the new line characters appear to be starting new rows.

Doing 'select * from [mytable] limit 10;' in the hive console returns ten rows, on more than ten lines. Doing 'select * from [view] limit 10' in the console returns ten lines but fewer than ten rows.

I've tried using the 'translate' function in the view definition to replace \r with a space character, but that seems to have just broken everything (it complains of a missing EOF).

Can anyone suggest a better way to remove the line breaks and/or prevent the view treating them as new rows?

Thanks,
Charles
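Assuming the table is stored in a line-oriented text format (TEXTFILE, Hive's default), the behaviour described here follows directly from the record reader splitting input on every newline, regardless of field boundaries. A minimal sketch of that effect (the sample row is made up for illustration):

```python
# Sketch: a line-oriented record reader treats every '\n' as a record
# separator, so a field value containing a newline becomes two records.

def read_records(raw):
    """Mimic a newline-delimited record reader: one record per line."""
    return raw.split("\n")

# One logical row whose middle field contains an embedded newline:
row = "id1\tsome text\nwith a line break\t2014-08-18"

records = read_records(row)
print(len(records))  # the single logical row is read back as 2 records
```

A common workaround (a suggestion, not something confirmed in the thread) is to strip the offending characters at load time rather than in the view, e.g. with Hive's regexp_replace(col, '\\r|\\n', ' '), or to move the table to a format that is not newline-delimited.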
Re: SerDe errors
Hi Roberto,

This got solved with the help of another user - the e-mails don't seem to have made it to the user list. There was a problem with the JSON serde, which meant it couldn't deserialise an object nested inside the main object. Changing to the Amazon serde fixed it.

Thanks,
Charles

On 14 August 2014 17:49, Roberto Congiu wrote:
> Can you provide the CREATE statement used to create the table and a sample
> of the json that's causing the error? It sounds like you have a field
> declared as bigint on the schema, but it's actually an object.
>
> On Wed, Aug 13, 2014 at 5:05 AM, Charles Robertson <charles.robert...@gmail.com> wrote:
>> Hi all,
>>
>> I have a Hive table which relies on a JSON SerDe to read the underlying
>> files. When I ran the create script I specified the SerDe and it all went
>> fine and the data was visible in the views above the table. When I tried to
>> query the table directly, though, I received a ClassNotFound error. I
>> solved this by putting the SerDe JAR in /usr/lib/hive/lib.
>>
>> Now, however, when I try to query the data I get:
>>
>> Failed with exception
>> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException:
>> java.lang.ClassCastException: org.json.JSONObject cannot be cast to
>> [Ljava.lang.Object;
>>
>> (The serde is the json serde provided by Apache)
>>
>> Can anyone suggest why it was working before, but no longer is?
>>
>> Thanks,
>> Charles

--
Good judgement comes with experience.
Experience comes with bad judgement.
--
Roberto Congiu - Data Engineer - OpenX
tel: +1 626 466 1141
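For readers hitting the same ClassCastException: the record shape at issue is a JSON object nested inside the top-level object. The field names below are made up for illustration (the thread never shows the actual schema), but the structural difference is what matters:

```python
import json

# A flat record: every value is a primitive, which simple serdes handle.
flat = '{"id": 1, "name": "event"}'

# A nested record: "payload" is itself an object. A serde that expects
# each column value to be a primitive can fail here with a
# ClassCastException like the one quoted in the thread.
nested = '{"id": 1, "payload": {"code": 42, "label": "x"}}'

record = json.loads(nested)
print(type(record["payload"]).__name__)  # dict - a nested object, not a primitive
```

Whether a given serde supports such nesting (typically mapped to a Hive struct column) depends on the serde implementation, which is why swapping serdes resolved it here.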
Re: Hive queries returning all NULL values.
Do your field names in your parquet files contain upper case letters by any chance, e.g. userName? Hive will not read the data of external tables if the field names are not completely lower case - it doesn't convert them properly in the case of external tables.

On Aug 17, 2014 8:00 AM, "hadoop hive" wrote:
> Take a small set of data, like 2-5 lines, and insert it...
>
> After that you can try inserting the first 10 columns and then the next 10
> until you find your problematic column.
> On Aug 17, 2014 8:37 PM, "Tor Ivry" wrote:
>> Is there any way to debug this?
>>
>> We are talking about many fields here.
>> How can I see which field has the mismatch?
>>
>> On Sun, Aug 17, 2014 at 4:30 PM, hadoop hive wrote:
>>> Hi,
>>>
>>> Check the data type you provided while creating the external table -
>>> it should match the data in the files.
>>>
>>> Thanks
>>> Vikas Srivastava
>>> On Aug 17, 2014 7:07 PM, "Tor Ivry" wrote:
>>>> Hi
>>>>
>>>> I have a hive (0.11) table with the following create syntax:
>>>>
>>>> CREATE EXTERNAL TABLE events(
>>>>   …
>>>> )
>>>> PARTITIONED BY (dt string)
>>>> ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
>>>> STORED AS
>>>>   INPUTFORMAT "parquet.hive.DeprecatedParquetInputFormat"
>>>>   OUTPUTFORMAT "parquet.hive.DeprecatedParquetOutputFormat"
>>>> LOCATION '/data-events/success';
>>>>
>>>> Query runs fine.
>>>>
>>>> I add hdfs partitions (containing snappy.parquet files).
>>>>
>>>> When I run
>>>> hive > select count(*) from events where dt="20140815"
>>>> I get the correct result.
>>>>
>>>> *Problem:*
>>>> When I run
>>>> hive > select * from events where dt="20140815" limit 1;
>>>> I get
>>>> OK
>>>> NULL NULL NULL NULL NULL NULL NULL 20140815
>>>>
>>>> *The same query in Impala returns the correct values.*
>>>>
>>>> Any idea what could be the issue?
>>>>
>>>> Thanks
>>>> Tor
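The column-name theory above can be illustrated with a small sketch (the field names are hypothetical - the thread never shows the real schema). If the reader looks up Hive's lower-cased column names against the file's mixed-case field names with a case-sensitive match, every lookup misses and the whole row surfaces as NULLs, which matches the symptom reported:

```python
# Sketch: Hive stores column names lower-cased ("username"), while the
# Parquet file may carry the writer's original mixed case ("userName").
# A case-sensitive lookup then misses every column and yields NULL.

parquet_fields = {"userName": "tor", "eventType": "click"}
hive_columns = ["username", "eventtype"]  # Hive lower-cases column names

case_sensitive = [parquet_fields.get(c) for c in hive_columns]
print(case_sensitive)  # [None, None] - every column reads back as NULL

# A case-insensitive match (what a tolerant reader would do) recovers them:
lowered = {k.lower(): v for k, v in parquet_fields.items()}
recovered = [lowered.get(c) for c in hive_columns]
print(recovered)  # ['tor', 'click']
```

This would also explain why count(*) works (it needs no column values) and why Impala, reading the same files through its own reader, returns correct results.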
Re: Hive queries returning all NULL values.
Take a small set of data, like 2-5 lines, and insert it...

After that you can try inserting the first 10 columns and then the next 10 until you find your problematic column.
On Aug 17, 2014 8:37 PM, "Tor Ivry" wrote:
> Is there any way to debug this?
>
> We are talking about many fields here.
> How can I see which field has the mismatch?
>
> On Sun, Aug 17, 2014 at 4:30 PM, hadoop hive wrote:
>> Hi,
>>
>> Check the data type you provided while creating the external table -
>> it should match the data in the files.
>>
>> Thanks
>> Vikas Srivastava
>> On Aug 17, 2014 7:07 PM, "Tor Ivry" wrote:
>>> Hi
>>>
>>> I have a hive (0.11) table with the following create syntax:
>>>
>>> CREATE EXTERNAL TABLE events(
>>>   …
>>> )
>>> PARTITIONED BY (dt string)
>>> ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
>>> STORED AS
>>>   INPUTFORMAT "parquet.hive.DeprecatedParquetInputFormat"
>>>   OUTPUTFORMAT "parquet.hive.DeprecatedParquetOutputFormat"
>>> LOCATION '/data-events/success';
>>>
>>> Query runs fine.
>>>
>>> I add hdfs partitions (containing snappy.parquet files).
>>>
>>> When I run
>>> hive > select count(*) from events where dt="20140815"
>>> I get the correct result.
>>>
>>> *Problem:*
>>> When I run
>>> hive > select * from events where dt="20140815" limit 1;
>>> I get
>>> OK
>>> NULL NULL NULL NULL NULL NULL NULL 20140815
>>>
>>> *The same query in Impala returns the correct values.*
>>>
>>> Any idea what could be the issue?
>>>
>>> Thanks
>>> Tor
Re: Hive queries returning all NULL values.
Is there any way to debug this?

We are talking about many fields here.
How can I see which field has the mismatch?

On Sun, Aug 17, 2014 at 4:30 PM, hadoop hive wrote:
> Hi,
>
> Check the data type you provided while creating the external table -
> it should match the data in the files.
>
> Thanks
> Vikas Srivastava
> On Aug 17, 2014 7:07 PM, "Tor Ivry" wrote:
>> Hi
>>
>> I have a hive (0.11) table with the following create syntax:
>>
>> CREATE EXTERNAL TABLE events(
>>   …
>> )
>> PARTITIONED BY (dt string)
>> ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
>> STORED AS
>>   INPUTFORMAT "parquet.hive.DeprecatedParquetInputFormat"
>>   OUTPUTFORMAT "parquet.hive.DeprecatedParquetOutputFormat"
>> LOCATION '/data-events/success';
>>
>> Query runs fine.
>>
>> I add hdfs partitions (containing snappy.parquet files).
>>
>> When I run
>> hive > select count(*) from events where dt="20140815"
>> I get the correct result.
>>
>> *Problem:*
>> When I run
>> hive > select * from events where dt="20140815" limit 1;
>> I get
>> OK
>> NULL NULL NULL NULL NULL NULL NULL 20140815
>>
>> *The same query in Impala returns the correct values.*
>>
>> Any idea what could be the issue?
>>
>> Thanks
>> Tor
Re: Hive queries returning all NULL values.
Hi,

Check the data type you provided while creating the external table - it should match the data in the files.

Thanks
Vikas Srivastava
On Aug 17, 2014 7:07 PM, "Tor Ivry" wrote:
> Hi
>
> I have a hive (0.11) table with the following create syntax:
>
> CREATE EXTERNAL TABLE events(
>   …
> )
> PARTITIONED BY (dt string)
> ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
> STORED AS
>   INPUTFORMAT "parquet.hive.DeprecatedParquetInputFormat"
>   OUTPUTFORMAT "parquet.hive.DeprecatedParquetOutputFormat"
> LOCATION '/data-events/success';
>
> Query runs fine.
>
> I add hdfs partitions (containing snappy.parquet files).
>
> When I run
> hive > select count(*) from events where dt="20140815"
> I get the correct result.
>
> *Problem:*
> When I run
> hive > select * from events where dt="20140815" limit 1;
> I get
> OK
> NULL NULL NULL NULL NULL NULL NULL 20140815
>
> *The same query in Impala returns the correct values.*
>
> Any idea what could be the issue?
>
> Thanks
> Tor
Hive queries returning all NULL values.
Hi

I have a hive (0.11) table with the following create syntax:

CREATE EXTERNAL TABLE events(
  …
)
PARTITIONED BY (dt string)
ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
STORED AS
  INPUTFORMAT "parquet.hive.DeprecatedParquetInputFormat"
  OUTPUTFORMAT "parquet.hive.DeprecatedParquetOutputFormat"
LOCATION '/data-events/success';

Query runs fine.

I add hdfs partitions (containing snappy.parquet files).

When I run
hive > select count(*) from events where dt="20140815"
I get the correct result.

*Problem:*
When I run
hive > select * from events where dt="20140815" limit 1;
I get
OK
NULL NULL NULL NULL NULL NULL NULL 20140815

*The same query in Impala returns the correct values.*

Any idea what could be the issue?

Thanks
Tor