Re: [How to :]Apache Impala and S3n File System
I think the Impala user mailing list may have a better answer; forwarding your question there.
[How to :]Apache Impala and S3n File System
Hi, Has anybody tried using Apache Impala with the S3n file system? Could you share the pros and cons? Appreciate the help. Thanks, Divya
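For anyone landing here later, a minimal sketch of what pointing Impala at S3 data can look like. This assumes an Impala release that supports S3-backed tables at all (support arrived in later versions and is generally documented against s3a:// rather than s3n://), and that the S3 credentials (fs.s3n.awsAccessKeyId / fs.s3n.awsSecretAccessKey) are already set in core-site.xml. The bucket, path, and column names are hypothetical.

    -- external table whose data lives in S3 rather than HDFS
    CREATE EXTERNAL TABLE clicks_s3 (
      user_id  STRING,
      event_ts TIMESTAMP,
      url      STRING
    )
    STORED AS PARQUET
    LOCATION 's3n://my-example-bucket/warehouse/clicks/';

    -- query it like any other table; reads go over the network,
    -- so expect higher latency than HDFS-local data
    SELECT COUNT(*) FROM clicks_s3;

The main trade-off is exactly that locality: Impala's short-circuit, data-local reads do not apply to S3, so scans are slower, but storage is decoupled from the cluster.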
Re: Impala
Thanks. I will do it! Juri

-----Original Message-----
From: Sean Busbey
To: Nagalingam, Karthikeyan
Cc: Kumar Jayapal; user; cdh-user
Sent: Wed, Mar 9, 2016 12:53 pm
Subject: Re: Impala

You should join the mailing list for Apache Impala (incubating) and ask your question over there: http://mail-archives.apache.org/mod_mbox/incubator-impala-dev/

On Wed, Mar 9, 2016 at 8:12 AM, Nagalingam, Karthikeyan wrote:
Hello, I am new to Impala; my goal is to test joins and aggregations against 2 million and 10 million records. Can you please provide some documentation or a website to get started?

Regards,
Karthikeyan Nagalingam,
Technical Marketing Engineer (Big Data Analytics)
Mobile: 919-376-6422

-- busbey
Re: Impala
You should join the mailing list for Apache Impala (incubating) and ask your question over there: http://mail-archives.apache.org/mod_mbox/incubator-impala-dev/

On Wed, Mar 9, 2016 at 8:12 AM, Nagalingam, Karthikeyan <karthikeyan.nagalin...@netapp.com> wrote:
> Hello,
> I am new to Impala; my goal is to test joins and aggregations against 2 million and 10 million records. Can you please provide some documentation or a website to get started?
> Regards,
> Karthikeyan Nagalingam,
> Technical Marketing Engineer (Big Data Analytics)
> Mobile: 919-376-6422

-- busbey
Impala
Hello, I am new to Impala; my goal is to test joins and aggregations against 2 million and 10 million records. Can you please provide some documentation or a website to get started?

Regards,
Karthikeyan Nagalingam,
Technical Marketing Engineer (Big Data Analytics)
Mobile: 919-376-6422
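Not documentation, but as a rough starting point, the kind of join/aggregation test described above can be sketched as below. The table and column names are made up; COMPUTE STATS is worth running before timing joins so the planner has row counts to work with.

    -- hypothetical tables: orders (~10M rows) and customers (~2M rows)
    COMPUTE STATS orders;
    COMPUTE STATS customers;

    -- join plus aggregation, to exercise the planner and exchange operators
    SELECT c.region,
           COUNT(*)      AS order_cnt,
           SUM(o.amount) AS total_amount
    FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id
    GROUP BY c.region
    ORDER BY total_amount DESC;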
Unable to connect to Impala shell after updating the cluster to 5.5.1
Hi, Has anyone had this issue with Impala? I am unable to connect to the Impala shell after updating the cluster to 5.5.1. My cluster uses Kerberos and LDAP for authentication. When I try to connect with the Impala shell it displays the message "LDAP credentials may not be sent over insecure connections. Enable SSL or set --auth_creds_ok_in_clear":

    impala-shell -i impala.jayapal.com -l -u jayapal
    LDAP credentials may not be sent over insecure connections. Enable SSL or set --auth_creds_ok_in_clear

If I use

    impala-shell -i impala.jayapal.com -l -u jayapal --auth_creds_ok_in_clear

it allows me to connect. Please let me know if anyone has resolved this issue. Thanks, Jay
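The --auth_creds_ok_in_clear flag does work, but it sends the LDAP password unencrypted. A sketch of the SSL route instead, assuming a certificate/key pair is available for the impalad host; the file paths below are hypothetical and the flags are the standard impalad TLS options:

    # impalad startup flags (e.g. via the flag file or the Impala
    # configuration safety valve), enabling TLS on the server side:
    #   --ssl_server_certificate=/etc/impala/certs/impalad.pem
    #   --ssl_private_key=/etc/impala/certs/impalad.key

    # then connect from the shell with SSL enabled, pointing at the CA
    # certificate that signed the server cert:
    impala-shell -i impala.jayapal.com -l -u jayapal --ssl \
        --ca_cert=/etc/impala/certs/ca.pem

With SSL enabled on the daemon, the "credentials may not be sent over insecure connections" check no longer applies, so --auth_creds_ok_in_clear is not needed.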
Impala returns error: "Bad status for request 5241: TGetOperationStatusResp"
Hello, I'm running CDH 5.2.1. When I tried to execute an Impala query on huge tables via the Hue UI I got this error:

    Bad status for request 5241: TGetOperationStatusResp(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), operationState=5, errorMessage=None, sqlState=None, errorCode=None)

First, I tried to execute the same query via impala-shell on one of my worker nodes. I was confused that the error message there was:

    Query did not have enough memory to get the minimum required buffers. Backend 3: Memory Limit Exceeded

I checked the impalad startup options. The command that `ps aux | grep impalad` gives me is:

    /opt/cloudera/parcels/CDH-5.2.1-1.cdh5.2.1.p0.12/lib/impala/sbin-retail/impalad --flagfile=/var/run/cloudera-scm-agent/process/7492-impala-IMPALAD/impala-conf/impalad_flags

And this is the content of the above flag file:

    -beeswax_port=21000
    -fe_port=21000
    -be_port=22000
    -llama_callback_port=28000
    -hs2_port=21050
    -enable_webserver=true
    -mem_limit=128849018880            <-- !!!
    -webserver_port=25000
    -max_result_cache_size=10
    -state_store_subscriber_port=23000
    -statestore_subscriber_timeout_seconds=30
    -scratch_dirs=/disk1/impala/impalad,/disk10/impala/impalad,/disk2/impala/impalad,/disk3/impala/impalad,/disk4/impala/impalad,/disk5/impala/impalad,/disk6/impala/impalad,/disk7/impala/impalad,/disk8/impala/impalad,/disk9/impala/impalad,/opt/impala/impalad,/disk11/impala/impalad,/disk12/impala/impalad
    -default_query_options             <-- !!!
    -log_filename=impalad
    -hostname=my_hostname
    -state_store_host=my_host
    -local_nodemanager_url=my_host:8042
    -llama_host=my_host
    -llama_port=15000
    -enable_rm=true
    -pool_conf_file=pool-acls.txt
    -cgroup_hierarchy_path=/var/run/cloudera-scm-agent/cgroups/cpu/hadoop-yarn
    -state_store_port=24000
    -catalog_service_host=my_host
    -catalog_service_port=26000
    -local_library_dir=/var/lib/impala/udfs
    -llama_max_request_attempts=5
    -llama_registration_timeout_secs=30
    -llama_registration_wait_secs=3
    -fair_scheduler_allocation_path=/var/run/cloudera-scm-agent/process/7494-impala-IMPALAD/impala-conf/fair-scheduler.xml
    -llama_site_path=/var/run/cloudera-scm-agent/process/7494-impala-IMPALAD/impala-conf/llama-site.xml
    -disk_spill_encryption=false

So I tried to change -default_query_options to:

    -default_query_options=mem_limit=128849018880

Now, if I run 'set;' in Hue or impala-shell I see:

    MEM_LIMIT: [128849018880]

Before, the value was "0", and my queries now execute successfully. Can you explain why I have to set mem_limit under the "-default_query_options" parameter? I thought the default memory limit for impalad was set by the "mem_limit" option.

-- Regards, Georgy
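One detail that may explain this: the process-level -mem_limit caps the impalad daemon as a whole, while MEM_LIMIT is a per-query option. When MEM_LIMIT is 0, the per-query cap instead comes from Impala's own estimate of the query's memory needs (and with -enable_rm=true that estimate also drives the resource reservation), so a low estimate can trigger the "minimum required buffers" error. Rather than baking a value into -default_query_options, it can also be set per session; a sketch, where the value is just an example in bytes:

    -- in impala-shell or Hue, before running the heavy query:
    SET MEM_LIMIT=21474836480;   -- per-query cap in bytes for this session
    SELECT ... ;                 -- the original query
    SET MEM_LIMIT=0;             -- back to the default afterwards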
Impala returns error: "Bad status for request 5241: TGetOperationStatusResp"
Hello, I'm running *CDH 5.2.1.* When I tried to execute impala query on huge tables via HUE UI I got such error: *Bad status for request 5241: TGetOperationStatusResp*(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), operationState=5, errorMessage=None, sqlState=None, errorCode=None) First, I've tried to execute the same query via *impala-shell* on one of my worker-nodes. I was confused that error message was: Query did not have enough memory to get the minimum required buffers. Backend 3:*Memory Limit Exceeded* I've checked impalad startup options. Command that gives me `ps aux | grep impalad` is: /opt/cloudera/parcels/CDH-5.2.1-1.cdh5.2.1.p0.12/lib/impala/sbin-retail/impalad --flagfile=/var/run/cloudera-scm-agent/process/7492-impala-IMPALAD/impala-conf/impalad_flags And there is the content of the above flag_file: -beeswax_port=21000 -fe_port=21000 -be_port=22000 -llama_callback_port=28000 -hs2_port=21050 -enable_webserver=true *-mem_limit=128849018880* -webserver_port=25000 -max_result_cache_size=10 -state_store_subscriber_port=23000 -statestore_subscriber_timeout_seconds=30 -scratch_dirs=/disk1/impala/impalad,/disk10/impala/impalad,/disk2/impala/impalad,/disk3/impala/impalad,/disk4/impala/impalad,/disk5/impala/impalad,/disk6/impala/impalad,/disk7/impala/impalad,/disk8/impala/impalad,/disk9/impala/impalad,/opt/impala/impalad,/disk11/impala/impalad,/disk12/impala/impalad *-default_query_options* -log_filename=impalad -hostname=my_hostname -state_store_host=my_host -local_nodemanager_url=my_host:8042 -llama_host=my_host -llama_port=15000 -enable_rm=true -pool_conf_file=pool-acls.txt -cgroup_hierarchy_path=/var/run/cloudera-scm-agent/cgroups/cpu/hadoop-yarn -state_store_port=24000 -catalog_service_host=my_host -catalog_service_port=26000 -local_library_dir=/var/lib/impala/udfs -llama_max_request_attempts=5 -llama_registration_timeout_secs=30 -llama_registration_wait_secs=3 -fair_scheduler_allocation_path=/var/run/cloudera-scm-agent/process/7494-impala-IMPALAD/impala-conf/fair-scheduler.xml -llama_site_path=/var/run/cloudera-scm-agent/process/7494-impala-IMPALAD/impala-conf/llama-site.xml -disk_spill_encryption=false So, I've tried to change *-default_query_options *to: *-default_query_options=mem_limit=**128849018880* Now, if I do 'set;' in HUE or impala-shell I see: * MEM_LIMIT: [128849018880]* Before there was value "0". My queries were succesfully executed. Can you explain, why I have to set mem_limit under parameter " *-default_query_options*"? I thought that default memory limit for impalad is set by "mem_limit" option -- Regards, Georgy
Impala returns error: "Bad status for request 5241: TGetOperationStatusResp"
Hello, I'm running *CDH 5.2.1.* When I tried to execute impala query on huge tables via HUE UI I got such error: *Bad status for request 5241: TGetOperationStatusResp*(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), operationState=5, errorMessage=None, sqlState=None, errorCode=None) First, I've tried to execute the same query via *impala-shell* on one of my worker-nodes. I was confused that error message was: Query did not have enough memory to get the minimum required buffers. Backend 3:*Memory Limit Exceeded* I've checked impalad startup options. Command that gives me `ps aux | grep impalad` is: /opt/cloudera/parcels/CDH-5.2.1-1.cdh5.2.1.p0.12/lib/impala/sbin-retail/impalad --flagfile=/var/run/cloudera-scm-agent/process/7492-impala-IMPALAD/impala-conf/impalad_flags And there is the content of the above flag_file: -beeswax_port=21000 -fe_port=21000 -be_port=22000 -llama_callback_port=28000 -hs2_port=21050 -enable_webserver=true *-mem_limit=128849018880* -webserver_port=25000 -max_result_cache_size=10 -state_store_subscriber_port=23000 -statestore_subscriber_timeout_seconds=30 -scratch_dirs=/disk1/impala/impalad,/disk10/impala/impalad,/disk2/impala/impalad,/disk3/impala/impalad,/disk4/impala/impalad,/disk5/impala/impalad,/disk6/impala/impalad,/disk7/impala/impalad,/disk8/impala/impalad,/disk9/impala/impalad,/opt/impala/impalad,/disk11/impala/impalad,/disk12/impala/impalad *-default_query_options* -log_filename=impalad -hostname=my_hostname -state_store_host=my_host -local_nodemanager_url=my_host:8042 -llama_host=my_host -llama_port=15000 -enable_rm=true -pool_conf_file=pool-acls.txt -cgroup_hierarchy_path=/var/run/cloudera-scm-agent/cgroups/cpu/hadoop-yarn -state_store_port=24000 -catalog_service_host=my_host -catalog_service_port=26000 -local_library_dir=/var/lib/impala/udfs -llama_max_request_attempts=5 -llama_registration_timeout_secs=30 -llama_registration_wait_secs=3 -fair_scheduler_allocation_path=/var/run/cloudera-scm-agent/process/7494-impala-IMPALAD/impala-conf/fair-scheduler.xml -llama_site_path=/var/run/cloudera-scm-agent/process/7494-impala-IMPALAD/impala-conf/llama-site.xml -disk_spill_encryption=false So, I've tried to change *-default_query_options *to: *-default_query_options=mem_limit=**128849018880* Now, if I do 'set;' in HUE or impala-shell I see: * MEM_LIMIT: [128849018880]* Before there was value "0". My queries were succesfully executed. Can you explain, why I have to set mem_limit under parameter " *-default_query_options*"? I thought that default memory limit for impalad is set by "mem_limit" option. -- Regards, Georgy
Re: Impala CREATE TABLE AS AVRO Requires "Redundant" Schema - Why?
Hi, Impala is a product of Cloudera. You might request help per: https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user

BR, Alex

> On 26 Feb 2015, at 17:15, Vitale, Tom wrote:
>
> I used sqoop to import an MS SQL Server table into an Avro file on HDFS. No problem. Then I tried to create an external Impala table using the following DDL:
>
> CREATE EXTERNAL TABLE AvroTable
> STORED AS AVRO
> LOCATION '/tmp/AvroTable';
>
> I got the error "ERROR: AnalysisException: Error loading Avro schema: No Avro schema provided in SERDEPROPERTIES or TBLPROPERTIES for table: default.AvroTable"
>
> So I extracted the schema from the Avro file using avro-tools-1.7.4.jar (-getschema) into a JSON file, then per the recommendation above, changed the DDL to point to it:
>
> CREATE EXTERNAL TABLE AvroTable
> STORED AS AVRO
> LOCATION '/tmp/AvroTable'
> TBLPROPERTIES(
>   'serialization.format'='1',
>   'avro.schema.url'='hdfs://...net/tmp/AvroTable.schema'
> );
>
> This worked fine. But my question is, why do you have to do this? The schema is already in the Avro file – that's where I got the JSON schema file that I point to in the TBLPROPERTIES parameter!
>
> Thanks, Tom
>
> Tom Vitale
> CREDIT SUISSE
> Information Technology | Infra Arch & Strategy NY, KIVP
> Eleven Madison Avenue | 10010-3629 New York | United States
> Phone +1 212 538 0708
> thomas.vit...@credit-suisse.com | www.credit-suisse.com
Impala CREATE TABLE AS AVRO Requires "Redundant" Schema - Why?
I used sqoop to import an MS SQL Server table into an Avro file on HDFS. No problem. Then I tried to create an external Impala table using the following DDL:

    CREATE EXTERNAL TABLE AvroTable
    STORED AS AVRO
    LOCATION '/tmp/AvroTable';

I got the error "ERROR: AnalysisException: Error loading Avro schema: No Avro schema provided in SERDEPROPERTIES or TBLPROPERTIES for table: default.AvroTable"

So I extracted the schema from the Avro file using avro-tools-1.7.4.jar (-getschema) into a JSON file, then per the recommendation above, changed the DDL to point to it:

    CREATE EXTERNAL TABLE AvroTable
    STORED AS AVRO
    LOCATION '/tmp/AvroTable'
    TBLPROPERTIES(
      'serialization.format'='1',
      'avro.schema.url'='hdfs://...net/tmp/AvroTable.schema'
    );

This worked fine. But my question is, why do you have to do this? The schema is already in the Avro file - that's where I got the JSON schema file that I point to in the TBLPROPERTIES parameter!

Thanks, Tom

Tom Vitale
CREDIT SUISSE
Information Technology | Infra Arch & Strategy NY, KIVP
Eleven Madison Avenue | 10010-3629 New York | United States
Phone +1 212 538 0708
thomas.vit...@credit-suisse.com | www.credit-suisse.com
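For reference, a sketch of the extraction step described above, assuming avro-tools is available on the local machine; the data file name (part-m-00000.avro) is the usual sqoop output naming but is hypothetical here. An alternative that avoids the extra HDFS file is inlining the JSON via 'avro.schema.literal' instead of 'avro.schema.url'.

    # pull one data file down and extract its embedded schema
    hdfs dfs -get /tmp/AvroTable/part-m-00000.avro .
    java -jar avro-tools-1.7.4.jar getschema part-m-00000.avro > AvroTable.schema

    # publish the schema where the table DDL can reference it
    hdfs dfs -put AvroTable.schema /tmp/AvroTable.schema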
Fwd: External Table creation in hive fails on impala integration with hive
-- Forwarded message --
From: Sathish Kumar
Date: Wed, Oct 23, 2013 at 10:28 AM
Subject: Re: External Table creation in hive fails on impala integration with hive
To: cdh-u...@cloudera.org

Hi All,

Thanks Saro, it worked. I have one small doubt: if my row key and value are as below, what data types should I use for the columns in

    create external TABLE hbase_table(key int, value string)

given a scanned row that looks like:

    ROW                  COLUMN+CELL
    \x00\x00\x01As\xBDJ  column=d:a, timestamp=1380629572482, value=\x1F\x8B\x08\x08cn

Regards,
Sathish

On Wed, Oct 23, 2013 at 1:26 AM, Saro saravanan wrote:
> hi
>
> set hbase.zookeeper.quorum=localhost;
>
> create external TABLE hbase_table(key int, value string) STORED BY
> 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH
> SERDEPROPERTIES("hbase.columns.mapping" = ":key,datahive1:name")
> TBLPROPERTIES("hbase.table.name" = "tablename");
>
> On Wed, Oct 23, 2013 at 8:22 AM, Sathish Kumar wrote:
>>
>> -- Forwarded message --
>> From: Sathish Kumar
>> Date: Tue, Oct 22, 2013 at 4:59 PM
>> Subject: External Table creation in hive fails on impala integration with hive
>> To: cdh-u...@cloudera.org
>>
>> Hi All,
>>
>> I am trying to integrate Impala with HBase and received a syntax error as mentioned below.
>>
>> ERROR: AnalysisException: Syntax error at:
>> create EXTERNAL TABLE hbase_table_2(key int, value int, value2 string)
>> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH
>> SERDEPROPERTIES ("hbase.columns.mapping" = "d:val") TBLPROPERTIES("hbase.table.name" = "xyz")
>>
>> In the above command, I suspect the column name "val" is wrong; by running the "describe tablename" command I can find the column family name, but I am not sure how to find the column name.
>>
>> Please help me if you find anything wrong in my command.
>>
>> Regards,
>> Sathish
>
> --
> Thanks
> saravanan
> 9095260692
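On the follow-up question (what types to use, and how to find the column qualifier): a single-row scan in the hbase shell prints each cell as column=family:qualifier, so the output above suggests family "d" and qualifier "a". A sketch follows, with the table name hypothetical; the key is mapped as STRING because the row key above looks like binary bytes rather than a plain integer, and the value is kept as STRING because the stored bytes look like compressed data. Note the DDL must be run in Hive, not Impala, since Impala does not accept the STORED BY clause (which is also why the original statement failed in Impala).

    # in the hbase shell, look at one row to see the family:qualifier names
    scan 'tablename', {LIMIT => 1}
    # prints lines like:  <rowkey>  column=d:a, timestamp=..., value=...

    -- then in Hive, map that column:
    create external TABLE hbase_table(key string, value string)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES("hbase.columns.mapping" = ":key,d:a")
    TBLPROPERTIES("hbase.table.name" = "tablename");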
Fwd: External Table creation in hive fails on impala integration with hive
-- Forwarded message --
From: Sathish Kumar
Date: Tue, Oct 22, 2013 at 4:59 PM
Subject: External Table creation in hive fails on impala integration with hive
To: cdh-u...@cloudera.org

Hi All,

I am trying to integrate Impala with HBase and received a syntax error as mentioned below.

    ERROR: AnalysisException: Syntax error at:
    create EXTERNAL TABLE hbase_table_2(key int, value int, value2 string)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH
    SERDEPROPERTIES ("hbase.columns.mapping" = "d:val") TBLPROPERTIES("hbase.table.name" = "xyz")

In the above command, I suspect the column name "val" is wrong; by running the "describe tablename" command I can find the column family name, but I am not sure how to find the column name.

Please help me if you find anything wrong in my command.

Regards,
Sathish
Re: impala
Read http://blog.cloudera.com/blog/2012/10/cloudera-impala-real-time-queries-in-apache-hadoop-for-real/

On Mon, Aug 26, 2013 at 3:01 PM, Ram wrote:
> Hi,
> Can anyone help with the following?
>
> How exactly does Impala work? What happens when you submit a query? How is the data transferred to the different nodes?
>
> From,
> Ramesh.

-- Nitin Pawar
impala
Hi,

Can anyone help with the following?

How exactly does Impala work? What happens when you submit a query? How is the data transferred to the different nodes?

From,
Ramesh.
Need help with cluster setup for performance [Impala]
My apologies for sending this message to this group, but I'm having trouble sending to the right group.

From: Steven Wong
Sent: Wednesday, January 23, 2013 11:15 AM
To: impala-u...@cloudera.org
Subject: RE: Need help with cluster setup for performance

Thanks for the suggestions. The /metrics output looks good now, and the SELECT COUNT(*) runs much faster than before. But I still have the "Unknown disk id" error message. My CDH version is:

    hadoop-client    x86_64 2.0.0+552-1.cdh4.1.2.p0.27.el5 cloudera-cdh4 18 k
    hadoop-mapreduce x86_64 2.0.0+552-1.cdh4.1.2.p0.27.el5 cloudera-cdh4 9.8 M
    hadoop-yarn      x86_64 2.0.0+552-1.cdh4.1.2.p0.27.el5 cloudera-cdh4 8.9 M

On Tuesday, January 22, 2013 5:37:30 PM UTC-8, Henry wrote:

On 22 January 2013 11:40, Steven Wong wrote:

Hi, I followed http://zenfractal.com/2012/11/15/from-zero-to-impala-in-minutes/ to set up a cluster on EC2. After seeing disappointing performance numbers from a SELECT COUNT(*), I am following https://ccp.cloudera.com/display/IMPALA10BETADOC/Configuring+Impala+for+Performance#ConfiguringImpalaforPerformance-TestingImpalaforHighPerformanceConfiguration to check my cluster setup. Questions:

1. My cluster has 3 data nodes. Is the following http://<host>:<port>/metrics output good?

    statestore.backend.state.map: { 127.0.0.1:23000 : OK }
    statestore.live.backends: 3
    statestore.live.backends.list: [127.0.0.1:22000]

Hi Steven - This looks like your problem. Your machines are registering themselves with 'localhost' as their hostname, and this means that they all look the same to the statestore. I looked at Matt's zero-to-impala link - it's awesome, but now a little out of date. You should modify where you run impalad to also have --ipaddress and --hostname correctly set for each node. Then check the statestore metrics; things should look a lot better and your performance should improve.

2. My impalad logs contain "Unknown disk id. This will negatively affect performance. Check your hdfs settings to enable block location metadata." and my http://<host>:<port>/varz doesn't contain the string "dfs.datanode.hdfs-blocks-metadata.enabled". But my hdfs-site.xml sets dfs.datanode.hdfs-blocks-metadata.enabled to true. Why?

What version of CDH are you using?

3. My impalad.out doesn't contain "Unable to load native-hadoop library". This is good, I believe.

4. My impalad logs contain the following lines matching the word "scheduler", but none contains "locality percentage". Why?

    /tmp/impalad.INFO:I0122 00:19:09.137197  5121 simple-scheduler.cc:82] Starting simple scheduler
    /tmp/impalad.ip-10-170-17-154.impala.log.INFO.20130122-001901.5121:I0122 00:19:09.137197  5121 simple-scheduler.cc:82] Starting simple scheduler

The locality percentage is printed only for GLOG_v=1 - and I note that the setup-impala.sh script has a typo where it has GVLOG_v=1. If you fix this, you should see the locality percentage.

Hope this helps - let us know if things improve.

Henry

Thanks.
Steven

--
Henry Robinson
Software Engineer
Cloudera
415-994-6679
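For later readers, a sketch of the two fixes discussed above; hostnames, IPs, and paths are placeholders. The first part is the hdfs-site.xml property that enables block-location metadata (restart the DataNodes after changing it); the second is starting each impalad with its real hostname and IP rather than localhost, with GLOG_v=1 so the locality percentage is logged.

    <!-- hdfs-site.xml on every DataNode -->
    <property>
      <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
      <value>true</value>
    </property>

    # per-node impalad startup, as suggested in the thread above
    GLOG_v=1 impalad \
        --hostname=$(hostname -f) \
        --ipaddress=<node-private-ip> \
        --state_store_host=<statestore-host> &

Once each node registers under its own hostname, statestore.live.backends.list on the /metrics page should show one distinct address per data node instead of repeated 127.0.0.1 entries.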