sparkR 3rd library

2017-09-03 Thread patcharee
could not find function "rbga" at org.apache.spark.api.r.RRunner.compute(RRunner.scala:108) at org.apache.spark.api.r.BaseRRDD.compute(RRDD.scala:51) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala ... Any ideas?

simple application on tez + llap

2017-02-24 Thread Patcharee Thongtra
Hi, I found an example of simple applications like wordcount running on tez - https://github.com/apache/tez/tree/master/tez-examples/src/main/java/org/apache/tez/examples. However, how to run this on tez+llap? Any suggestions? BR, Patcharee

Re: import sql file

2016-11-23 Thread patcharee
I exported a SQL table into a .sql file and would like to import it into Hive. Best, Patcharee On 23. nov. 2016 10:40, Markovitz, Dudu wrote: Hi Patcharee The question is not clear. Dudu -Original Message- From: patcharee [mailto:patcharee.thong...@uni.no] Sent: Wednesday, November 23

import sql file

2016-11-23 Thread patcharee
Hi, How can I import a .sql file into Hive? Best, Patcharee

Re: hiveserver2 java heap space

2016-10-24 Thread Patcharee Thongtra
It works on Hive cli Patcharee On 10/24/2016 11:51 AM, Mich Talebzadeh wrote: does this work ok through Hive cli? Dr Mich Talebzadeh LinkedIn /https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw/ http://talebzadehmich.wordpress.com *Disclaimer:* Use it at

hiveserver2 java heap space

2016-10-24 Thread Patcharee Thongtra
I wonder why I got this error because I query just ONE line. Any ideas? Thanks, Patcharee

hiveserver2 GC overhead limit exceeded

2016-10-23 Thread patcharee
from org.apache.hadoop.hive.ql.exec.DDLTask. GC overhead limit exceeded (state=08S01,code=1) How can I solve this? How can I identify whether this error comes from the client (beeline) or from hiveserver2? Thanks, Patcharee

Re: Spark DataFrame Plotting

2016-09-08 Thread patcharee
Hi Moon, When I generate an extra column (the schema will be Index:Int, A:Double, B:Double), what SQL command generates a graph with 2 lines (Index as the X-axis, BOTH A and B as the Y-axis)? Do I need to group by? Thanks! Patcharee On 07. sep. 2016 16:58, moon soo Lee wrote: You will need to
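A minimal Scala sketch of this approach, assuming a Zeppelin notebook, an existing DataFrame df with columns A and B, and the Spark 1.6 API; the added Index column lets both A and B be charted against the same X-axis, and no group by is needed:

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{LongType, StructField, StructType}

// prepend a running row index so A and B share one X-axis
val indexed = df.rdd.zipWithIndex.map { case (row, i) => Row.fromSeq(i +: row.toSeq) }
val schema  = StructType(StructField("Index", LongType, nullable = false) +: df.schema.fields)
val dfIdx   = sqlContext.createDataFrame(indexed, schema)
dfIdx.registerTempTable("ab")
// in a %sql paragraph: SELECT Index, A, B FROM ab
// then choose the line chart with Index as the key and A, B as the values
```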

Re: Spark DataFrame Plotting

2016-09-07 Thread patcharee
A normal select * gives me one column on the X-axis and another on the Y-axis. I cannot get both A:Double and B:Double displayed on the Y-axis. How can I do that? Patcharee On 07. sep. 2016 11:05, Abhisar Mohapatra wrote: You can do a normal select * on the dataframe and it would be automatically interpreted

Spark DataFrame Plotting

2016-09-07 Thread patcharee
Hi, I have a dataframe with this schema A:Double, B:Double. How can I plot this dataframe as two lines (comparing A and B at each step)? Best, Patcharee

what contribute to Task Deserialization Time

2016-07-21 Thread patcharee
Thanks in advance! Patcharee

Re: Failed to stream on Yarn cluster

2016-04-28 Thread patcharee
is the taskmanager.out and how can I change it? Best, Patcharee On 28. april 2016 13:18, Maximilian Michels wrote: Hi Patcharee, What do you mean by "nothing happened"? There is no output? Did you check the logs? Cheers, Max On Thu, Apr 28, 2016 at 12:10 PM, patcharee wrote:

Failed to stream on Yarn cluster

2016-04-28 Thread patcharee
happened. Any ideas? I tested the word count example from hdfs file on Yarn cluster and it worked fine. Best, Patcharee

Re: pyspark split pair rdd to multiple

2016-04-20 Thread patcharee
I can also use dataframe. Any suggestions? Best, Patcharee On 20. april 2016 10:43, Gourav Sengupta wrote: Is there any reason why you are not using data frames? Regards, Gourav On Tue, Apr 19, 2016 at 8:51 PM, pth001 <mailto:patcharee.thong...@uni.no>> wrote: Hi, How ca

Re: build r-intepreter

2016-04-14 Thread Patcharee Thongtra
Yes, I did not install R. Stupid me. Thanks for your guide! BR, Patcharee On 04/13/2016 08:23 PM, Eric Charles wrote: Can you post the full stacktrace you have (look also at the log file)? Did you install R on your machine? SPARK_HOME is optional. On 13/04/16 15:39, Patcharee Thongtra wrote

executor running time vs getting result from jupyter notebook

2016-04-14 Thread Patcharee Thongtra
factor of time spending on these steps? BR, Patcharee

Re: build r-intepreter

2016-04-13 Thread Patcharee Thongtra
spark for testing first. BR, Patcharee On 04/13/2016 02:52 PM, Patcharee Thongtra wrote: Hi, I have been struggling with R interpreter / SparkR interpreter. Is below the right command to build zeppelin with R interpreter / SparkR interpreter? mvn clean package -Pspark-1.6 -Phadoop-2.6

build r-intepreter

2016-04-13 Thread Patcharee Thongtra
Hi, I have been struggling with R interpreter / SparkR interpreter. Is below the right command to build zeppelin with R interpreter / SparkR interpreter? mvn clean package -Pspark-1.6 -Phadoop-2.6 -Pyarn -Ppyspark -Psparkr BR, Patcharee

ExclamationTopology workers executors vs tasks

2016-03-01 Thread patcharee
Also from the Storm UI, the Num executors and Num tasks of the Spout word and the Bolts exclaim1 and exclaim2 are 10, 3 and 2 respectively (the same as defined in the code). Thanks, Patcharee

kafka streaming topic partitions vs executors

2016-02-26 Thread patcharee
the topic's partitions). However, some executors are given more than one task and work on these tasks sequentially. Why does Spark not distribute these 10 tasks to 10 executors? How can I make it do that? Thanks, Patcharee
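A hedged Scala sketch of this setup with the direct Kafka API on Spark 1.6, assuming an existing StreamingContext ssc; the broker list, topic name, and target parallelism are made up. The direct stream creates exactly one task per Kafka topic partition, so if a few executors keep grabbing several of those tasks, repartitioning each batch is one way to spread the records over more executors:

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.KafkaUtils

val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
// one task per Kafka topic partition (10 partitions -> 10 tasks per batch)
val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, Set("mytopic"))
// shuffle each batch into 10 partitions so the work can land on 10 executors
val spread = stream.repartition(10)
```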

Pyspark filter not empty

2016-01-29 Thread patcharee
Hi, In PySpark, how can I filter on a DataFrame column that is not empty? I tried: dfNotEmpty = df.filter(df['msg']!='') but it did not work. Thanks, Patcharee
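For reference, a hedged sketch of the same filter in the Scala DataFrame API; the likely catch is that "empty" values may actually be null, which != '' alone does not catch. The SQL-expression form at the end also works from PySpark as df.filter("msg is not null and msg != ''"):

```scala
// keep rows where msg is neither null nor the empty string
val dfNotEmpty = df.filter(df("msg").isNotNull && df("msg").notEqual(""))

// the same condition as a SQL expression string
val dfNotEmpty2 = df.filter("msg is not null and msg != ''")
```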

Re: streaming textFileStream problem - got only ONE line

2016-01-29 Thread patcharee
I moved them every interval to the monitored directory. Patcharee On 25. jan. 2016 22:30, Shixiong(Ryan) Zhu wrote: Did you move the file into "hdfs://helmhdfs/user/patcharee/cerdata/", or write into it directly? `textFileStream` requires that files must be written to the monitored

streaming textFileStream problem - got only ONE line

2016-01-25 Thread patcharee
().print() The problem is sometimes the data received from scc.textFileStream is ONLY ONE line. But in fact there are multiple lines in the new file found in that interval. See log below which shows three intervals. In the 2nd interval, the new file is: hdfs://helmhdfs/user/patcharee/cerdata
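A minimal Scala sketch of the pattern the reply above describes, assuming Spark Streaming 1.6, an existing SparkContext sc, and the HDFS path from the thread; textFileStream only picks up files that appear atomically, so each file should be written elsewhere on the same filesystem first and then renamed into the monitored directory:

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext(sc, Seconds(60))
// write new files to a temp dir first, then rename them into this directory each interval
val lines = ssc.textFileStream("hdfs://helmhdfs/user/patcharee/cerdata/")
lines.count().print()
ssc.start()
ssc.awaitTermination()
```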

spark streaming input rate strange

2016-01-22 Thread patcharee
rises up to 10,000, stays at 10,000 for a while and drops to about 7000-8000. - When clients = 20,000 the event rate rises up to 20,000, stays at 20,000 for a while and drops to about 15000-17000. The same pattern. Processing time is just about 400 ms. Any ideas/suggestions? Thanks, Patcharee

visualize data from spark streaming

2016-01-20 Thread patcharee
Hi, How can I visualize realtime data (in a graph/chart) from Spark Streaming? Any tools? Best, Patcharee

bad performance on PySpark - big text file

2015-12-08 Thread patcharee
log of these two input splits (check python.PythonRunner: Times: total ... ) 15/12/08 07:37:15 INFO rdd.NewHadoopRDD: Input split: hdfs://helmhdfs/user/patcharee/ntap-raw-20151015-20151126/html2/budisansblog.blogspot.com.html:39728447488+134217728 15/12/08 08:49:30 INFO python.PythonRunner

Re: Spark UI - Streaming Tab

2015-12-04 Thread patcharee
I ran streaming jobs, but no streaming tab appeared for those jobs. Patcharee On 04. des. 2015 18:12, PhuDuc Nguyen wrote: I believe the "Streaming" tab is dynamic - it appears once you have a streaming job running, not when the cluster is simply up. It does not depend on 1.6 and h

Spark UI - Streaming Tab

2015-12-04 Thread patcharee
need to configure the history UI somehow to get such interface? Thanks, Patcharee

Spark applications metrics

2015-12-04 Thread patcharee
Hi, How can I see the summary of data read / write, shuffle read / write, etc. of an application, not per stage? Thanks, Patcharee

Re: Spark Streaming - History UI

2015-12-02 Thread patcharee
I meant there is no streaming tab at all. It looks like I need version 1.6 Patcharee On 02. des. 2015 11:34, Steve Loughran wrote: The history UI doesn't update itself for live apps (SPARK-7889) -though I'm working on it Are you trying to view a running streaming job? On 2 Dec 2

Spark Streaming - History UI

2015-12-01 Thread patcharee
Hi, On my history server UI, I cannot see the "streaming" tab for any streaming jobs. I am using version 1.5.1. Any ideas? Thanks, Patcharee

spark streaming count msg in batch

2015-12-01 Thread patcharee
Hi, In Spark Streaming, how can I count the total number of messages (from a socket) in one batch? Thanks, Patcharee
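A minimal Scala sketch, assuming an existing SparkContext sc and a hypothetical socket source on localhost:9999; count() turns each batch into a single-element DStream holding that batch's record count:

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext(sc, Seconds(10))
val lines = ssc.socketTextStream("localhost", 9999)
// one element per batch: the number of messages received in that batch
lines.count().foreachRDD { rdd =>
  rdd.collect().foreach(n => println(s"messages in this batch: $n"))
}
ssc.start()
ssc.awaitTermination()
```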

custom inputformat recordreader

2015-11-26 Thread Patcharee Thongtra
Hi, In Python, how can I use a custom InputFormat/RecordReader? Thanks, Patcharee

data local read counter

2015-11-25 Thread Patcharee Thongtra
Hi, Is there a counter for data-local reads? I understood the locality level counter would show it, but it seems not. Thanks, Patcharee

locality level counter

2015-11-25 Thread Patcharee Thongtra
? Thanks, Patcharee

Re: query orc file by hive

2015-11-13 Thread patcharee
Hi, It works after I added the partitions with ALTER TABLE. Thanks! My partitioned ORC file (directory) is created by Spark, so Hive is not aware of the partitions automatically. Best, Patcharee On 13. nov. 2015 13:08, Elliot West wrote: Have you added the partitions to the meta store? ALTER TABLE
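A hedged sketch of that step, run here through a HiveContext (the same statements work from the Hive CLI); the table name, columns, paths, and the (zone, year) partition layout are illustrative only:

```scala
// register the Spark-written ORC directory as an external Hive table
hiveContext.sql(
  """CREATE EXTERNAL TABLE IF NOT EXISTS orc_table (x FLOAT, y FLOAT)
    |PARTITIONED BY (zone INT, year INT)
    |STORED AS ORC
    |LOCATION 'hdfs:///user/patcharee/orc_output'""".stripMargin)

// Hive does not discover partitions written by another tool on its own;
// each partition directory has to be added to the metastore explicitly
hiveContext.sql(
  """ALTER TABLE orc_table ADD IF NOT EXISTS PARTITION (zone=1, year=2015)
    |LOCATION 'hdfs:///user/patcharee/orc_output/zone=1/year=2015'""".stripMargin)
```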

Re: query orc file by hive

2015-11-13 Thread patcharee
Hi, It works with non-partitioned ORC, but does not work with (2-column) partitioned ORC. Thanks, Patcharee On 09. nov. 2015 10:55, Elliot West wrote: Hi, You can create a table and point the location property to the folder containing your ORC file: CREATE EXTERNAL TABLE orc_table

query orc file by hive

2015-11-09 Thread patcharee
Hi, How can I query an ORC file (*.orc) with Hive? This ORC file is created by other apps, like Spark or MR. Thanks, Patcharee

[jira] [Comment Edited] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-11-06 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14993771#comment-14993771 ] patcharee edited comment on SPARK-11087 at 11/6/15 2:5

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-11-06 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14993771#comment-14993771 ] patcharee commented on SPARK-11087: --- Hi, I found a scenario where the predicate

[jira] [Issue Comment Deleted] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-11-06 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] patcharee updated SPARK-11087: -- Comment: was deleted (was: Hi [~zzhan], the problem actually happens when I generates orc file by

[jira] [Issue Comment Deleted] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-11-06 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] patcharee updated SPARK-11087: -- Comment: was deleted (was: I found a scenario where the problem exists

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-11-06 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14993398#comment-14993398 ] patcharee commented on SPARK-11087: --- Hi [~zzhan], the problem actually happens wh

[jira] [Reopened] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-11-06 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] patcharee reopened SPARK-11087: --- I found a scenario where the problem exists > spark.sql.orc.filterPushdown does not work, No

How to run parallel on each DataFrame group

2015-11-05 Thread patcharee
The problem is each group, after being filtered, is handled by an executor one by one. How can I change the code to allow each group to run in parallel? I looked at groupBy, but it seems to be only for aggregation. Thanks, Patcharee

Min-Max Index vs Bloom filter

2015-11-02 Thread patcharee
Hi, For the ORC format, in which scenario is a bloom filter better than the min-max index? Best, Patcharee
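As a rough rule, the per-stripe min/max statistics prune well when the column is sorted or has locality, so a range or equality predicate matches only a few narrow stripes; a bloom filter tends to pay off for equality lookups on unsorted, high-cardinality columns, where almost every stripe's min/max range covers the searched value anyway. A hedged sketch of enabling a bloom filter via table properties, with hypothetical table and column names:

```scala
hiveContext.sql(
  """CREATE TABLE events_orc (id BIGINT, msg STRING)
    |STORED AS ORC
    |TBLPROPERTIES ('orc.bloom.filter.columns'='id',
    |               'orc.bloom.filter.fpp'='0.05')""".stripMargin)
```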

execute native system commands in Spark

2015-11-02 Thread patcharee
Hi, Is it possible to execute native system commands (in parallel) in Spark, like scala.sys.process? Best, Patcharee
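A minimal Scala sketch of two common patterns, scala.sys.process inside mapPartitions and RDD.pipe; the commands shown are only placeholders:

```scala
import scala.sys.process._

// run one external command per partition, on the executors
val hostnames = sc.parallelize(1 to 100, 10).mapPartitions { _ =>
  Iterator(Seq("hostname").!!.trim)
}
hostnames.distinct().collect().foreach(println)

// or stream every element of an RDD through an external command
val piped = sc.parallelize(Seq("a", "b", "c")).pipe("cat")
piped.collect().foreach(println)
```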

[jira] [Closed] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-23 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] patcharee closed SPARK-11087. - Resolution: Not A Problem The predicate is indeed generated and can be found in the executor log

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-23 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14970786#comment-14970786 ] patcharee commented on SPARK-11087: --- [~zzhan] I found the predicate generated in

the column names removed after insert select

2015-10-23 Thread patcharee
it is supposed to be - Type: struct Any ideas how this happened and how I can fix it? Please advise. BR, Patcharee

the number of files after merging

2015-10-22 Thread patcharee
whole table, not one-by-one partition? Thanks, Patcharee

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-21 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967661#comment-14967661 ] patcharee commented on SPARK-11087: --- Hi [~zzhan] What version of hive and orc

[jira] [Comment Edited] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-18 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960296#comment-14960296 ] patcharee edited comment on SPARK-11087 at 10/19/15 3:3

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-16 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960296#comment-14960296 ] patcharee commented on SPARK-11087: --- [~zhazhan] Below is my test. Please chec

Re: sql query orc slow

2015-10-13 Thread Patcharee Thongtra
Hi Zhan Zhang, Here is the issue https://issues.apache.org/jira/browse/SPARK-11087 BR, Patcharee On 10/13/2015 06:47 PM, Zhan Zhang wrote: Hi Patcharee, I am not sure which side is wrong, driver or executor. If it is executor side, the reason you mentioned may be possible. But if the

[jira] [Created] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-13 Thread patcharee (JIRA)
patcharee created SPARK-11087: - Summary: spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate Key: SPARK-11087 URL: https://issues.apache.org/jira/browse/SPARK-11087 Project: Spark

orc table with sorted field

2015-10-13 Thread Patcharee Thongtra
ddl page, it seems only a bucketed table can be sorted. Any suggestions please? BR, Patcharee

Re: sql query orc slow

2015-10-13 Thread Patcharee Thongtra
not sorted / indexed - the split strategy hive.exec.orc.split.strategy BR, Patcharee On 10/09/2015 08:01 PM, Zhan Zhang wrote: That is weird. Unfortunately, there is no debug info available on this part. Can you please open a JIRA to add some debug information on the driver side? Thanks. Zhan

Re: sql query orc slow

2015-10-09 Thread patcharee
I set hiveContext.setConf("spark.sql.orc.filterPushdown", "true"), but the log shows no ORC pushdown predicate for my query with a WHERE clause: 15/10/09 19:16:01 DEBUG OrcInputFormat: No ORC pushdown predicate I do not understand what is wrong with this. BR, Patcharee On

Re: sql query orc slow

2015-10-09 Thread patcharee
this time the pushdown predicate was generated in the log, but the results were wrong (no results at all): 15/10/09 18:36:06 INFO OrcInputFormat: ORC pushdown predicate: leaf-0 = (EQUALS x 320) expr = leaf-0 Any ideas what is wrong with this? Why is the ORC pushdown predicate not applied by the system? BR

Re: sql query orc slow

2015-10-08 Thread patcharee
Yes, the predicate pushdown is enabled, but it still takes longer than the first method. BR, Patcharee On 08. okt. 2015 18:43, Zhan Zhang wrote: Hi Patcharee, Did you enable the predicate pushdown in the second method? Thanks. Zhan Zhang On Oct 8, 2015, at 1:43 AM, patcharee wrote: Hi

sql query orc slow

2015-10-08 Thread patcharee
Hi, I am using Spark SQL 1.5 to query a hive table stored as partitioned ORC files. We have about 6000 files in total and each file is about 245 MB. What is the difference between these two query methods below: 1. Querying the hive table directly: hiveContext.sql("select col1,
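A hedged sketch of the two access paths being compared, with hypothetical column names and warehouse path, on the Spark 1.5 API:

```scala
// 1. query the Hive metastore table directly
hiveContext.setConf("spark.sql.orc.filterPushdown", "true")
val viaTable = hiveContext.sql(
  "SELECT col1, col2 FROM orc_table WHERE zone = 2 AND year = 2009")

// 2. read the ORC files as a DataFrame and filter/select on it
val viaFiles = hiveContext.read.format("orc")
  .load("hdfs:///apps/hive/warehouse/orc_table")
  .filter("zone = 2 AND year = 2009")
  .select("col1", "col2")
```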

hiveContext sql number of tasks

2015-10-07 Thread patcharee
to force Spark SQL to use fewer tasks? BR, Patcharee

Idle time between jobs

2015-09-16 Thread patcharee
.scala:143 15/09/16 11:21:08 INFO DAGScheduler: Got job 2 (saveAsTextFile at GenerateHistogram.scala:143) with 1 output partitions 15/09/16 11:21:08 INFO DAGScheduler: Final stage: ResultStage 2(saveAsTextFile at GenerateHistogram.scala:143) BR,

spark performance - executor computing time

2015-09-15 Thread patcharee
and low GC time as others. What can impact the executor computing time? Any suggestions on which parameters I should monitor/configure? BR, Patcharee

spark 1.5 sort slow

2015-09-01 Thread patcharee
y configuration explicitly? Any suggestions? BR, Patcharee

embedded pig in the custer

2015-07-22 Thread patcharee
. BR, Patcharee

Re: character '' not supported here

2015-07-20 Thread patcharee
data, like select count(*) from Table, any more; I just got the error line 1:1 character '' not supported here, with either the Tez or MR engine. How did you solve the problem in your case? BR, Patcharee On 18. juli 2015 21:26, Nitin Pawar wrote: can you tell exactly what steps you did/? al

Re: character '' not supported here

2015-07-18 Thread patcharee
This select * from table limit 5; works, but not others. So? Patcharee On 18. juli 2015 12:08, Nitin Pawar wrote: can you do select * from table limit 5; On Sat, Jul 18, 2015 at 3:35 PM, patcharee <mailto:patcharee.thong...@uni.no>> wrote: Hi, I am using hive 0.14 with T

character '' not supported here

2015-07-18 Thread patcharee
upported here line 1:137 character '' not supported here line 1:138 character '' not supported here line 1:139 character '' not supported here line 1:140 character '' not supported here line 1:141 character '' not supported here line 1:142 character '' not supported here line 1:143 character '' not supported here line 1:144 character '' not supported here line 1:145 character '' not supported here line 1:146 character '' not supported here BR, Patcharee

Re: fails to alter table concatenate

2015-06-30 Thread patcharee
Actually it works on MR, so the problem is from Tez. Thanks! BR, Patcharee On 30. juni 2015 10:23, Nitin Pawar wrote: can you try doing same by changing the query engine from tez to mr1? not sure if its hive bug or tez bug On Tue, Jun 30, 2015 at 1:46 PM, patcharee <mailto:patcharee.th

fails to alter table concatenate

2015-06-30 Thread patcharee
Task BR, Patcharee

fails to alter table concatenate

2015-06-30 Thread patcharee
5) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) ] DAG failed due to vertex failure. failedVertices:1 killedVertices:0 FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.DDLTask BR, Patcharee

Re: Kryo serialization of classes in additional jars

2015-06-26 Thread patcharee
Hi, I am having this problem on spark 1.4. Do you have any ideas how to solve it? I tried to use spark.executor.extraClassPath, but it did not help BR, Patcharee On 04. mai 2015 23:47, Imran Rashid wrote: Oh, this seems like a real pain. You should file a jira, I didn't see an open

Re: HiveContext saveAsTable create wrong partition

2015-06-16 Thread patcharee
I found that if I move the partitioned columns in schemaString and in the Row to the end of the sequence, then it works correctly... On 16. juni 2015 11:14, patcharee wrote: Hi, I am using spark 1.4 and HiveContext to append data into a partitioned hive table. I found that the data insert into the
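A minimal sketch of that workaround on Spark 1.4, keeping the four partition columns from the thread (zone, z, year, month) as the last fields of both the schema and every Row; the leading data columns and values are made up:

```scala
import org.apache.spark.sql.{Row, SaveMode}
import org.apache.spark.sql.types._

val schema = StructType(Seq(
  StructField("x", FloatType),      StructField("y", FloatType),        // data columns first
  StructField("zone", IntegerType), StructField("z", IntegerType),
  StructField("year", IntegerType), StructField("month", IntegerType)   // partition columns last
))
val rows = sc.parallelize(Seq(Row(1.0f, 2.0f, 2, 42, 2015, 6)))

hiveContext.createDataFrame(rows, schema)
  .write.mode(SaveMode.Append)
  .partitionBy("zone", "z", "year", "month")
  .saveAsTable("test4DimBySpark")
```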

HiveContext saveAsTable create wrong partition

2015-06-16 Thread patcharee
") .mode(org.apache.spark.sql.SaveMode.Append).partitionBy("zone","z","year","month").saveAsTable("test4DimBySpark") --- The table contains 23 columns (longer than Tuple maximum length), so I use Row Object to store raw data, not Tupl

sql.catalyst.ScalaReflection scala.reflect.internal.MissingRequirementError

2015-06-15 Thread patcharee
hemaFor(ScalaReflection.scala:59) at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:28) at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:410) at org.apache.spark.sql.SQLContext$implicits$.rddToDataFrameHold

Re: hiveContext.sql NullPointerException

2015-06-11 Thread patcharee
ot;:true,\"metadata\":{}},{\"name\":\"v\",\"type\":\"float\",\"nullable\":true,\"metadata\":{}},{\"name\":\"zone\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{}}

Re: hiveContext.sql NullPointerException

2015-06-08 Thread patcharee
Hi, Thanks for your guidelines. I will try it out. Btw how do you know that HiveContext.sql (and also DataFrame.registerTempTable) is only expected to be invoked on the driver side? Where can I find documentation? BR, Patcharee On 07. juni 2015 16:40, Cheng Lian wrote: Spark SQL supports Hive dynamic
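A hedged sketch of the driver-side pattern referred to above (Hive dynamic partitioning), assuming an existing DataFrame myDF built from the grouped data; the table and column names are made up:

```scala
// enable Hive dynamic partitioning so the partition values come from the SELECT itself
hiveContext.setConf("hive.exec.dynamic.partition", "true")
hiveContext.setConf("hive.exec.dynamic.partition.mode", "nonstrict")

// register the DataFrame and run the insert on the driver;
// hiveContext.sql never has to be called inside an executor-side closure
myDF.registerTempTable("staging")
hiveContext.sql(
  """INSERT INTO TABLE target_table PARTITION (zone, year)
    |SELECT x, y, zone, year FROM staging""".stripMargin)
```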

Re: hiveContext.sql NullPointerException

2015-06-07 Thread patcharee
Hi, How can I use HiveContext on the executors? If only the driver can see the HiveContext, does it mean I have to collect all datasets (very large) to the driver and use the HiveContext there? That would overload the driver's memory and fail. BR, Patcharee On 07. juni 2015 11:51

hiveContext.sql NullPointerException

2015-06-06 Thread patcharee
Hi, I am trying to insert data into a partitioned hive table. The groupByKey is to combine the dataset into a partition of the hive table. After the groupByKey, I converted the Iterable[X] to a DF by X.toList.toDF(). But hiveContext.sql throws a NullPointerException, see below. Any suggestions? What c

write multiple outputs by key

2015-06-06 Thread patcharee
# partitions). At foreach there are > 1000 tasks as well, but only 50 tasks (the same as the number of all key combinations) get datasets. How can I fix this problem? Any suggestions are appreciated. BR, Patcharee

Re: FetchFailed Exception

2015-06-05 Thread patcharee
Hi, I had this problem before, and in my case it was because the executor/container was killed by YARN when it used more memory than allocated. You can check whether your case is the same by checking the YARN node manager log. Best, Patcharee On 05. juni 2015 07:25, ÐΞ€ρ@Ҝ (๏̯͡๏) wrote: I see this

NullPointerException SQLConf.setConf

2015-06-04 Thread patcharee
nt.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Best, Patcharee

Re: ERROR cluster.YarnScheduler: Lost executor

2015-06-03 Thread patcharee
1.3.1, is the problem from the https://issues.apache.org/jira/browse/SPARK-4516? Best, Patcharee On 03. juni 2015 10:11, Akhil Das wrote: Which version of spark? Looks like you are hitting this one https://issues.apache.org/jira/browse/SPARK-4516 Thanks Best Regards On Wed, Jun 3, 2015 at 1

Re: ERROR cluster.YarnScheduler: Lost executor

2015-06-03 Thread patcharee
943, chunkIndex=1}, buffer=FileSegmentManagedBuffer{file=/hdisk3/hadoop/yarn/local/usercache/patcharee/appcache/application_1432633634512_0213/blockmgr-12d59e6b-0895-4a0e-9d06-152d2f7ee855/09/shuffle_0_56_0.data, offset=896, length=1132499356}} to /10.10.255.238:35430; closing connect

MetaException(message:java.security.AccessControlException: Permission denied

2015-06-03 Thread patcharee
at com.sun.proxy.$Proxy37.alter_partition(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.alterPartition(Hive.java:469) ... 26 more BR, Patcharee

ERROR cluster.YarnScheduler: Lost executor

2015-06-02 Thread patcharee
Hi, What can be the cause of this ERROR cluster.YarnScheduler: Lost executor? How can I fix it? Best, Patcharee

Insert overwrite to hive - ArrayIndexOutOfBoundsException

2015-06-02 Thread patcharee
current.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Best, Patcharee

Re: cast column float

2015-05-29 Thread patcharee
1.59 s OK 7.157847455.192524 7.157847455.192524 7.157847455.192524 7.157847455.192524 7.157847455.192524 Patcharee On 27. mai 2015 18:12, Bhagwan S. Soni wrote: could you also provide some sample dataset for these two columns? On Wed, May 27, 2

pig performance on reading/filtering orc file

2015-05-29 Thread patcharee
coordinate_xy by (xlong_u, xlat_u) USING 'replicated'; Best, Patcharee

cast column float

2015-05-27 Thread patcharee
records matched the condition. What can be wrong? I am using Hive 0.14 BR, Patcharee

Re: EOFException - TezJob - Cannot submit DAG

2015-05-26 Thread patcharee
org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:607) However I am curious why the data request size 166822274 is bigger than my HDFS max block size (128 MB). Do you have an idea? BR, Patcharee On 22. mai 2015 19:58, Johannes Zillmann wrote: Hey Patcharee, i sometimes faced that in

EOFException - TezJob - Cannot submit DAG

2015-05-22 Thread patcharee
Hi, I ran a pig script on tez and got the EOFException. Check at http://wiki.apache.org/hadoop/EOFException I have no ideas at all how I can fix it. However I did not get the exception when I executed this pig script on MR. I am using HadoopVersion: 2.6.0.2.2.4.2-2, PigVersion: 0.14.0.2.2.4.

Re: conflict from apache commons codec

2015-05-20 Thread patcharee
be wrong for the latter? BR, Patcharee On 20. mai 2015 09:37, Siddharth Seth wrote: My best guess would be that an older version of commons-codec is also on the classpath for the running task. If you have access to the local-dirs configured under YARN - you could find the application dir in

saveasorcfile on partitioned orc

2015-05-20 Thread patcharee
Hi, I followed the information on https://www.mail-archive.com/reviews@spark.apache.org/msg141113.html to save orc file with spark 1.2.1. I can save data to a new orc file. I wonder how to save data to an existing and partitioned orc file? Any suggestions? BR, Patcharee
