hadoop.ParquetOutputCommitter: could not write summary file

2016-03-29 Thread
An error occurred when writing Parquet files to disk. Any advice? I want to know the reason. Thanks. ``` 16/03/29 18:31:48 WARN hadoop.ParquetOutputCommitter: could not write summary file for file:/tmp/goods/2015-6 java.lang.NullPointerException at
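A hedged sketch, not from the thread: one commonly cited workaround for the "could not write summary file" warning is to disable Parquet summary metadata on the Hadoop configuration before writing. Input path, output mode, and app name below are illustrative assumptions.
```
# Sketch only: disable Parquet summary (_metadata) files before writing.
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(appName="parquet-summary-example")
# parquet.enable.summary-metadata is the parquet-hadoop switch for summary files.
sc._jsc.hadoopConfiguration().set("parquet.enable.summary-metadata", "false")

sql_context = SQLContext(sc)
df = sql_context.read.json("file:/tmp/goods/input.json")  # hypothetical input
df.write.mode("overwrite").parquet("file:/tmp/goods/2015-6")
```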

How to read compressed parquet file

2015-09-09 Thread
I think too many Parquet files may affect read performance, so I used hadoop archive to combine them, but sql_context.read.parquet(output_path) does not work on the resulting file. How can I fix it? Please help me. :)

Re: How to read compressed parquet file

2015-09-09 Thread
It works on Spark 1.4. Thanks a lot. 2015-09-09 17:21 GMT+08:00 Cheng Lian <lian.cs@gmail.com>: > You need to use "har://" instead of "hdfs://" to read HAR files. Just > tested against Spark 1.5, and it works as expected. > > Cheng > > > On
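A minimal sketch of the fix described above; the archive name, namenode address, and paths are illustrative assumptions, not values from the thread.
```
# The HAR itself would be built with something like:
#   hadoop archive -archiveName goods.har -p /user/data/parquet /user/data
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(appName="read-har-parquet")
sql_context = SQLContext(sc)

# Reading through hdfs:// only shows the raw HAR container files:
#   sql_context.read.parquet("hdfs://namenode:8020/user/data/goods.har")  # fails
# Reading through har:// exposes the archived Parquet files transparently:
df = sql_context.read.parquet("har://hdfs-namenode:8020/user/data/goods.har")
df.show()
```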

Differences in loading data using the Spark data source API versus JDBC

2015-08-10 Thread
Hi, everyone. I have a question about loading data: which is more efficient, using the Spark data source API or using JDBC?

Differences in loading data

2015-08-10 Thread
What are the differences between loading data using JDBC and loading data using the Spark data source API? Or between loading data using mongo-hadoop and loading data using the native Java driver? Which way is better?

Re: How to increase the number of tasks

2015-06-05 Thread
Did you change the value of 'spark.default.parallelism'? Try a bigger number. 2015-06-05 17:56 GMT+08:00 Evo Eftimov evo.efti...@isecc.com: It may be that your system runs out of resources (ie 174 is the ceiling) due to the following 1. RDD Partition = (Spark) Task 2.

Re: How to increase the number of tasks

2015-06-05 Thread
Just multiply 2-4 by the CPU core count of the node. 2015-06-05 18:04 GMT+08:00 ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com: I did not change spark.default.parallelism. What is the recommended value for it? On Fri, Jun 5, 2015 at 3:31 PM, 李铖 lidali...@gmail.com wrote: Did you change the value
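A minimal sketch of the rule of thumb above (2-4x the total CPU cores); the core and node counts are illustrative assumptions, and the input path is hypothetical.
```
from pyspark import SparkConf, SparkContext

cores_per_node = 8
num_nodes = 3
parallelism = cores_per_node * num_nodes * 3   # somewhere in the 2-4x range

conf = (SparkConf()
        .setAppName("parallelism-example")
        .set("spark.default.parallelism", str(parallelism)))
sc = SparkContext(conf=conf)

# The setting applies to shuffles and RDDs that don't specify a partition
# count; a partition count can also be requested per operation:
rdd = sc.textFile("hdfs:///data/input.txt", minPartitions=parallelism)
print(rdd.getNumPartitions())
```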

'Java heap space' error occurred when querying a 4G data file from HDFS

2015-04-07 Thread
In my dev-test env I have 3 virtual machines; every machine has 12G memory and 8 CPU cores. Here are spark-defaults.conf and spark-env.sh. Maybe some config is not right. I run this command: *spark-submit --master yarn-client --driver-memory 7g --executor-memory 6g /home/hadoop/spark/main.py*

Re: 'Java heap space' error occurred when querying a 4G data file from HDFS

2015-04-07 Thread
Any help, please? Help me find the right configuration. 李铖 lidali...@gmail.com wrote on Tuesday, April 7, 2015: In my dev-test env I have 3 virtual machines; every machine has 12G memory and 8 CPU cores. Here are spark-defaults.conf and spark-env.sh. Maybe some config is not right. I run this command: *spark-submit
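A hedged sketch only, not the thread's confirmed fix: on 12G / 8-core nodes, leaving headroom for the OS and YARN container overhead (rather than giving 7g to the driver and 6g to each executor) is a common starting point. All numbers are illustrative, the input path is hypothetical, and the reader call assumes the Spark 1.4+ DataFrame API.
```
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

conf = (SparkConf()
        .setMaster("yarn-client")
        .setAppName("heap-space-example")
        .set("spark.executor.memory", "4g")                 # leave room for OS and other services
        .set("spark.yarn.executor.memoryOverhead", "512")   # MB of off-heap cushion per container
        .set("spark.executor.cores", "4"))

sc = SparkContext(conf=conf)
sql_context = SQLContext(sc)
df = sql_context.read.json("hdfs:///data/big-file.json")    # hypothetical 4G input
df.registerTempTable("t")
sql_context.sql("SELECT count(*) FROM t").show()
```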

Missing an output location for shuffle. : (

2015-03-26 Thread
Again, when I run a Spark SQL query on a larger file, an error occurred. Has anyone fixed this? Please help me. Here is the stack trace. org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 0 at

Re: Missing an output location for shuffle. : (

2015-03-26 Thread
172.100.11.250 80:71:7a:95:48:a2 1364021 seconds 2015-03-26 23:01 GMT+08:00 Michael Armbrust mich...@databricks.com: I would suggest looking for errors in the logs of your executors. On Thu, Mar 26, 2015 at 3:20 AM, 李铖 lidali...@gmail.com wrote: Again, when I run a Spark SQL query on a larger file, an error

Spark-sql query got exception.Help

2015-03-25 Thread
It is OK when I query data from a small HDFS file, but if the HDFS file is 152 MB I get this exception. I tried 'sc.setSystemProperty('spark.kryoserializer.buffer.mb', '256')' but the error persists. ``` com.esotericsoftware.kryo.KryoException: Buffer overflow. Available: 0, required: 39135 at

Re: Spark-sql query got exception.Help

2015-03-25 Thread
the 2nd one is larger (it seems that Kryo doesn’t check for it). Cheng On 3/25/15 7:31 PM, 李铖 wrote: Here is the full stack trace 15/03/25 17:48:34 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1, cloud1): com.esotericsoftware.kryo.KryoException: Buffer overflow. Available: 0, required

Re: Spark-sql query got exception.Help

2015-03-25 Thread
the full stack trace? On 3/25/15 6:26 PM, 李铖 wrote: It is OK when I query data from a small HDFS file, but if the HDFS file is 152 MB I get this exception. I tried 'sc.setSystemProperty('spark.kryoserializer.buffer.mb', '256')' but the error persists. ``` com.esotericsoftware.kryo.KryoException

Re: Spark-sql query got exception.Help

2015-03-25 Thread
$WriterThread.run(PythonRDD.scala:203) 2015-03-26 10:39 GMT+08:00 李铖 lidali...@gmail.com: Yes, it works after I added the two properties to spark-defaults.conf. As I program in Python on the Spark platform, the Python API does not have a SparkConf API. Thanks. 2015-03-25 21:07 GMT+08:00 Cheng Lian
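A minimal sketch assuming the two properties referenced above are the Spark 1.3-era Kryo buffer settings (spark.kryoserializer.buffer.mb and spark.kryoserializer.buffer.max.mb). The thread put them in spark-defaults.conf; the same values can also be set programmatically via pyspark's SparkConf, as sketched here. The buffer sizes are illustrative.
```
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setAppName("kryo-buffer-example")
        .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        .set("spark.kryoserializer.buffer.mb", "256")        # initial buffer size, in MB
        .set("spark.kryoserializer.buffer.max.mb", "512"))   # ceiling the buffer may grow to

sc = SparkContext(conf=conf)
```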

Should I do spark-sql query on HDFS or apache hive?

2015-03-17 Thread
Hi, everybody. I am new to Spark. Now I want to do interactive SQL queries using Spark SQL. Spark SQL can run on Hive or by loading files from HDFS. Which is better or faster? Thanks.

Re: Should I do spark-sql query on HDFS or apache hive?

2015-03-17 Thread
Even Hive tables are usually read from files in HDFS. You should probably use HiveContext, as its query language is more powerful than SQLContext's. Also, Parquet is usually the fastest data format for Spark SQL. On Tue, Mar 17, 2015 at 3:41 AM, 李铖 lidali...@gmail.com wrote: Hi, everybody. I am
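A minimal sketch of the suggestion above: use HiveContext (a superset of SQLContext with a richer SQL dialect) and keep the data in Parquet. Table and path names are hypothetical, and the reader call assumes the Spark 1.4+ DataFrame API.
```
from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext(appName="hivecontext-example")
hive_context = HiveContext(sc)

# Query Parquet files straight from HDFS ...
df = hive_context.read.parquet("hdfs:///warehouse/events.parquet")
df.registerTempTable("events")
hive_context.sql("SELECT count(*) FROM events").show()

# ... or query an existing Hive table by name through the same context:
# hive_context.sql("SELECT count(*) FROM my_hive_db.events").show()
```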