Re: big issue on retrieving 400MB data

2017-04-28 Thread yu feng
We have faced this problem before with JDBC: the query spends too much time
fetching results, and while the client fetches, the query is still running,
which occupies a slot in the request pool (when admission control is
configured). We had to cancel such queries after a certain time, e.g. 5
minutes.

In our solution: JDBC fetches results with a default fetchSize of 50; we set
fetchSize = 1 (by calling setFetchSize()) and the acceleration was obvious
(in our test the result set was 2000+W rows, i.e. over 20 million, and this
sped the query up from more than 1 hour to 500 seconds).
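
For reference, here is a minimal sketch of that change in plain JDBC. The
connection URL, driver endpoint, and table name are illustrative assumptions,
not taken from this thread; the point is only the setFetchSize() call:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class FetchSizeTest {
    public static void main(String[] args) throws Exception {
        // Hypothetical HiveServer2-style endpoint; requires the Impala/Hive
        // JDBC driver on the classpath.
        String url = "jdbc:hive2://impalad-host:21050/;auth=noSasl";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement()) {
            // Override the driver's default fetchSize of 50, as described above.
            stmt.setFetchSize(1);
            try (ResultSet rs = stmt.executeQuery("select * from big_table")) {
                long rows = 0;
                while (rs.next()) {
                    rows++; // real code would write each row out
                }
                System.out.println("Fetched " + rows + " row(s)");
            }
        }
    }
}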

2017-04-29 0:39 GMT+08:00 Mostafa Mokhtar :

> Hi Jeszy
>
> One option for large data extracts is to create a text table in HDFS, then
> use "hdfs dfs -copyToLocal"
> <https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html#copyToLocal>
> to extract the text file(s). For large data dumps, storing the data in HDFS
> and then extracting it is likely to be faster than impala-shell or JDBC.
>
> When using impala-shell, it is best practice to use -B to speed up fetching
> the results.

Fwd: Github's disappearing mirrors

2017-04-28 Thread Jim Apple
-- Forwarded message --
From: Chris Lambertus 
Date: Fri, Apr 28, 2017 at 12:22 PM
Subject: Github's disappearing mirrors
To: committers 


Hello committers,

We have received quite a few reports of GitHub mirrors going missing. We’ve
tracked this down to an errant process at GitHub which appears to be deleting
not only our mirrors but also other orgs’ mirrors. We contacted GitHub but
have yet to receive a reply. Another organization also contacted GitHub and
received the following reply:

"Hi there, Sorry for the trouble! We've now had a couple of reports of this
problem, and we've opened an issue internally to investigate.  I don't have
an
ETA on a fix, but we'll be in touch if we need more information from you or
if
we have any information to share.  Regards, Laura GitHub Support”


We have no further information at this time. We have been restoring the
mirrors wherever possible, but until the root cause is resolved on GitHub’s
side, we expect mirrors to continue to be erroneously removed.

Access to the repos via the usual https://git-wip-us.apache.org/ channel
remains functional.

-Chris
ASF Infra




Re: incubator-impala github repo is gone ?

2017-04-28 Thread Michael Ho
https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git is still
around.

On Fri, Apr 28, 2017 at 11:16 AM, Michael Ho  wrote:

> https://github.com/apache/incubator-impala seems to be empty now.
>
> Just filed https://issues.apache.org/jira/browse/INFRA-14038
>
> --
> Thanks,
> Michael
>



-- 
Thanks,
Michael


incubator-impala github repo is gone ?

2017-04-28 Thread Michael Ho
https://github.com/apache/incubator-impala seems to be empty now.

Just filed https://issues.apache.org/jira/browse/INFRA-14038

-- 
Thanks,
Michael


Re: big issue on retrieving 400MB data

2017-04-28 Thread Mostafa Mokhtar
Hi Jeszy


One option for large data extracts is to create a text table in HDFS, then
use "hdfs dfs -copyToLocal"
<https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html#copyToLocal>
to extract the text file(s). For large data dumps, storing the data in HDFS
and then extracting it is likely to be faster than impala-shell or JDBC.

When using impala-shell, it is best practice to use -B to speed up fetching
the results.

Below are the results of two experiments: storing the results in an HDFS
text table and then copying the file out of HDFS took 14 seconds, compared
to 40 seconds when using impala-shell alone.

The lineitem table was used; the output data size is 390MB, which is
comparable to your use case.
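
For JDBC clients, the same workflow can be scripted end to end. The sketch
below is a hypothetical illustration (the port, auth settings, and driver
setup are assumptions, not from this thread): it issues the CTAS through a
JDBC Statement, then copies the resulting text files out of HDFS with the
Hadoop FileSystem API, the programmatic equivalent of "hdfs dfs -copyToLocal".

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ExportViaHdfs {
    public static void main(String[] args) throws Exception {
        // Hypothetical HiveServer2-style endpoint for Impala.
        String url = "jdbc:hive2://server1.domain.com:21050/;auth=noSasl";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement()) {
            // Materialize the result as text files in HDFS instead of
            // streaming every row back through the client.
            stmt.execute("drop table if exists tpch_300_parquet.lineitem_3m");
            stmt.execute("create table tpch_300_parquet.lineitem_3m "
                + "stored as textfile as select * from tpch_300_parquet.lineitem");
        }

        // Pull the text files down to the local filesystem.
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://vd1315.domain.com:8020");
        try (FileSystem fs = FileSystem.get(conf)) {
            fs.copyToLocalFile(
                new Path("/user/hive/warehouse/tpch_300_parquet.db/lineitem_3m"),
                new Path("lineitem_3m"));
        }
    }
}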

*Write to HDFS then extract*

time (impala-shell -q "drop table if exists tpch_300_parquet.lineitem_3m;
create table tpch_300_parquet.lineitem_3m stored as textfile as select *
from tpch_300_parquet.lineitem limit 300;" ; hdfs dfs -copyToLocal
hdfs://
vd1315.domain.com:8020/user/hive/warehouse/tpch_300_parquet.db/lineitem_3m
.; du -sh lineitem_3m )
Starting Impala Shell without Kerberos authentication
Connected to server1.domain.com:21000
Server version: impalad version 2.9.0-SNAPSHOT RELEASE (build
dc25acd558e9388290503e0e1100baac2b3c800b)
Query: drop table if exists tpch_300_parquet.lineitem_3m
Query: create table tpch_300_parquet.lineitem_3m stored as textfile as
select * from tpch_300_parquet.lineitem limit 300
Query submitted at: 2017-04-28 09:26:23 (Coordinator:
http://server1.domain.com:25000)
Query progress can be monitored at:
http://server1.domain.com:25000/query_plan?query_id=a24f86518c3de1ce:2f35ef9b
+-+
| summary |
+-+
| Inserted 300 row(s) |
+-+
Fetched 1 row(s) in 11.23s
390M lineitem_3m

*real 0m14.392s*
user 0m3.503s
sys 0m0.655s


*Extract results using impala-shell*

[mmokhtar@server1 ~]$ time (impala-shell -B -q "select * From
tpch_300_parquet.lineitem_3m " > lineitem_3m_impala_shell.txt; du -sh
lineitem_3m_impala_shell.txt; )
Starting Impala Shell without Kerberos authentication
Connected to server1.domain.com:21000
Server version: impalad version 2.9.0-SNAPSHOT RELEASE (build
dc25acd558e9388290503e0e1100baac2b3c800b)
Query: select * From tpch_300_parquet.lineitem_3m
Query submitted at: 2017-04-28 09:27:39 (Coordinator:
http://server1.domain.com:25000)
Query progress can be monitored at:
http://server1.domain.com:25000/query_plan?query_id=ee421b6d4a2226d3:8acbb75f
Fetched 300 row(s) in 38.88s
390M lineitem_3m_impala_shell.txt

*real 0m39.152s*
user 0m26.012s
sys 0m0.668s


On Fri, Apr 28, 2017 at 7:26 AM, Jeszy  wrote:

> Hey,
>
> It looks like all the time is spent waiting for the client to fetch the
> results:
>  - ClientFetchWaitTimer: 17m31s
>
> Try doing:
> impala-shell -B -q ''
>
> HTH
>
> 2017-04-28 14:51 GMT+02:00 吴朱华 :
> > Maybe I will just paste the main parts in the mail. And congratulations
> > on the IPO!

Re: big issue on retrieving 400MB data

2017-04-28 Thread Jeszy
Hey,

It looks like all the time is spent waiting for the client to fetch the results:
 - ClientFetchWaitTimer: 17m31s

Try doing:
impala-shell -B -q ''

HTH

2017-04-28 14:51 GMT+02:00 吴朱华 :
> Maybe I will just paste the main parts in the mail. And congratulations on
> the IPO!

Re: big issue on retrieving 400MB data

2017-04-28 Thread 吴朱华
Maybe I will just paste the main parts in the mail. And congratulations on
the IPO!

Unregister query: 17m42s (17m42s)

Fetched 317246 row(s) in 1062.84s
Query Runtime Profile:
Query (id=8149e2439f43b15a:f08e570d7fbf1085):
  Summary:
Session ID: 35436d1112b79287:9045c79c795858a5
Session Type: BEESWAX
Start Time: 2017-04-28 11:50:00.292615000
End Time: 2017-04-28 12:07:43.133484000
Query Type: QUERY
Query State: FINISHED
Query Status: OK
Impala Version: impalad version 2.5.0-cdh5-INTERNAL RELEASE (build
43880282edc04c03c162bbea6fc85b5388e7fdde)
User: impala
Connected User: impala
Delegated User:
Network Address: :::10.44.10.186:36325
Default Db: sjzy
Sql Statement: select
MRECID,UNITID,PCQDM,PCQMC,PCXQDM,PCXQMC,DM,H001,H002,H003,H021,H022,H023,H024,H025,H026A,H026B,H026C,H026D,H026E,H026F,H026G,H027,H028,H029,H030,H031,H032,H033,H034,H035,H036,H037A,H037B,H037C,H038,H039,H040,H041,H042,H043A,H043B,H043C,H043D,H043E,H043F,H043G,H043H,H043I,H043J,H043K,H043L,H044A,H044B,H044C,H044D,H044E,H044F,H044G,H044H,H044I,H050,H051,H052,H053,H054,H055,H056,H061,H062,H063,H064,H065,H066,H070,H071,H072,H073,H074,H075,H080,H100,H111,H112,H113,H120,H200,H201,H202,H203,H204,H205,H206,H207,H208,H209,H210,H211,H300,H320,H321,H322,H323,H324,H400,H401,H402,H403,H404,H405,H406,H500,H600,H601,H602,H603,H604,H605,H606,H607,H608,H609,H610,H611,H612,H613,H614,H615,H616,H621A,H621B,H621C,H621D,H621E,H621F,H622A,H622B,H622C,H801,H802,H803,H804,H901,H902,H903
FROM NP_2017_NP601 WHERE DS_AREACODE LIKE '445281%'
Coordinator: node1.sky.org:22000
Query Options (non default):
Plan:

Estimated Per-Host Requirements: Memory=4.50GB VCores=1

01:EXCHANGE [UNPARTITIONED]
|  hosts=4 per-host-mem=unavailable
|  tuple-ids=0 row-size=1.67KB cardinality=1155911
|
00:SCAN HDFS [sjzy.np_2017_np601, RANDOM]
   partitions=1/1 files=20 size=1.06GB
   predicates: DS_AREACODE LIKE '445281%'
   table stats: 11559109 rows total
   column stats: all
   hosts=4 per-host-mem=4.50GB
   tuple-ids=0 row-size=1.67KB cardinality=1155911

Estimated Per-Host Mem: 4831838208
Estimated Per-Host VCores: 1
Request Pool: default-pool
ExecSummary:
Operator   #Hosts  Avg Time  Max Time#Rows  Est. #Rows   Peak Mem
 Est. Peak Mem  Detail
-
01:EXCHANGE 1  32.314ms  32.314ms  317.25K   1.16M  0
 -1.00 B  UNPARTITIONED
00:SCAN HDFS   20   1s137ms   1s348ms  317.25K   1.16M  163.85 MB
 4.50 GB  sjzy.np_2017_np601
Planner Timeline: 53.683ms
   - Analysis finished: 24.565ms (24.565ms)
   - Equivalence classes computed: 26.389ms (1.823ms)
   - Single node plan created: 33.607ms (7.218ms)
   - Runtime filters computed: 33.684ms (76.568us)
   - Distributed plan created: 39.125ms (5.441ms)
   - Planning finished: 53.683ms (14.558ms)
Query Timeline: 17m42s
   - Start execution: 43.792us (43.792us)
   - Planning finished: 60.640ms (60.596ms)
   - Ready to start 20 remote fragments: 65.111ms (4.471ms)
   - All 20 remote fragments started: 74.572ms (9.461ms)
   - Rows available: 744.300ms (669.728ms)
   - First row fetched: 790.128ms (45.828ms)
   - Unregister query: 17m42s (17m42s)
  ImpalaServer:
 - ClientFetchWaitTimer: 17m31s
 - RowMaterializationTimer: 10s024ms

2017-04-28 19:44 GMT+08:00 Jim Apple :

> dev@ does not appear to accept attachments. You can upload it somewhere
> and
> post a link, though.


Re: big issue on retrieving 400MB data

2017-04-28 Thread Jim Apple
dev@ does not appear to accept attachments. You can upload it somewhere and
post a link, though.

On Thu, Apr 27, 2017 at 11:35 PM, 吴朱华  wrote:

> Oops, I just resent it. You know, the Chinese network ^_^


Re: big issue on retrieving 400MB data

2017-04-28 Thread 吴朱华
Oops, I just resent it. You know, the Chinese network ^_^

2017-04-28 14:20 GMT+08:00 Mostafa Mokhtar :

> Btw the profile wasn't attached.
> Please resend.


Re: big issue on retrieving 400MB data

2017-04-28 Thread 吴朱华
Profile is in the attachment, thanks


2017-04-28 13:10 GMT+08:00 Dimitris Tsirogiannis :

> Maybe you also want to post some information about the schema (how wide
> your table is, does it use nested types, etc) as well as the profile of the
> slow query.
>
> Dimitris
>
> On Thu, Apr 27, 2017 at 9:30 PM, 吴朱华  wrote:
>
> > Hi guys:
> > We are facing a big issue when running select * from a big table:
> > it takes 17 minutes to retrieve 400MB of data, and it is even slower
> > over JDBC.
> > Is there any way to improve it? ^_^
> >
>