Range partition for parquet file?

2016-05-27 Thread Rex Xiong
Hi, I have a Spark job that outputs a DataFrame containing a column named Id, a GUID string. We will use Id to filter data in another Spark application, so it should be a partition key. I found these two methods on the Internet: 1. The DataFrame.write.save("Id") method will help, but the possible
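A hedged sketch of the usual answer here, assuming Spark 1.4+ and a hypothetical DataFrame df in spark-shell: DataFrame.write.partitionBy lays files out as one directory per key value, so a later filter on Id can prune directories instead of scanning everything. With high-cardinality GUIDs this means one folder per Id, which is worth keeping in mind. The path and GUID below are illustrative.

    import org.apache.spark.sql.functions.col

    // Write one directory per Id value: .../Id=<guid>/part-*.parquet
    df.write.partitionBy("Id").parquet("hdfs:///data/table_by_id")

    // Reading side: a filter on the partition column only touches matching folders.
    val hits = sqlContext.read.parquet("hdfs:///data/table_by_id")
      .filter(col("Id") === "3f2504e0-4f89-11d3-9a0c-0305e82c3301")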

Re: Issue of Hive parquet partitioned table schema mismatch

2015-11-06 Thread Rex Xiong
...@gmail.com>: > Is there any chance that "spark.sql.hive.convertMetastoreParquet" is turned off? > Cheng > On 11/4/15 5:15 PM, Rex Xiong wrote: >> Thanks Cheng Lian. I found in 1.5, if I use Spark to create this table with partition discovery
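For context, a minimal sketch of how that setting is inspected and toggled (1.5-era SQLContext API; the default is true): when it is on, Spark replaces the Hive Parquet SerDe with its own reader and does its own schema reconciliation, which is where mismatches like this surface.

    // Check the current value, then fall back to the Hive SerDe if needed.
    sqlContext.getConf("spark.sql.hive.convertMetastoreParquet")
    sqlContext.setConf("spark.sql.hive.convertMetastoreParquet", "false")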

Re: Issue of Hive parquet partitioned table schema mismatch

2015-11-03 Thread Rex Xiong
On Oct 31, 2015, at 7:38 PM, "Rex Xiong" <bycha...@gmail.com> wrote: > Add back this thread to the email list, forgot to reply all. > On Oct 31, 2015, at 7:23 PM, "Michael Armbrust" <mich...@databricks.com> wrote: >> Not that I know of. >> On Sat, Oct 31, 2015 at 12:22 PM

Re: Issue of Hive parquet partitioned table schema mismatch

2015-10-31 Thread Rex Xiong
Add back this thread to the email list, forgot to reply all. On Oct 31, 2015, at 7:23 PM, "Michael Armbrust" <mich...@databricks.com> wrote: > Not that I know of. > On Sat, Oct 31, 2015 at 12:22 PM, Rex Xiong <bycha...@gmail.com> wrote: >> Good to know that, wil

Issue of Hive parquet partitioned table schema mismatch

2015-10-30 Thread Rex Xiong
Hi folks, I have a Hive external table with partitions. Every day, an app generates a new partition day=yyyy-MM-dd stored as Parquet and runs a Hive ADD PARTITION command. In some cases, we add an additional column to new partitions and update the Hive table schema, then a query across new and old
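A hedged sketch of the daily flow described above (table name, date, and paths are illustrative, not from the thread), plus the mergeSchema read option that asks Spark to reconcile old and new Parquet schemas:

    // Register the day's folder as a new partition (HiveContext SQL).
    sqlContext.sql(
      "ALTER TABLE events ADD IF NOT EXISTS PARTITION (day='2015-10-30') " +
      "LOCATION 'hdfs:///data/events/day=2015-10-30'")

    // When partitions carry different schemas, merge them on read (Spark 1.4+ API).
    val merged = sqlContext.read.option("mergeSchema", "true")
      .parquet("hdfs:///data/events")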

Issue of jar dependency in yarn-cluster mode

2015-10-16 Thread Rex Xiong
Hi folks, In my Spark application, an executor task depends on snakeyaml-1.10.jar. I build it with Maven and it works fine: spark-submit --master local --jars d:\snakeyaml-1.10.jar ... But when I try to run it on YARN, I have an issue; it seems the Spark executor cannot find the jar file:

Re: Issue of jar dependency in yarn-cluster mode

2015-10-16 Thread Rex Xiong
I finally resolved this issue by adding --conf spark.executor.extraClassPath=snakeyaml-1.10.jar 2015-10-16 22:57 GMT+08:00 Rex Xiong <bycha...@gmail.com>: > Hi folks, > In my Spark application, an executor task depends on snakeyaml-1.10.jar > I build it with Maven and it w
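A hedged sketch of what the fix implies (jar location and app name are illustrative): --jars localizes the jar into each container's working directory, and the relative spark.executor.extraClassPath then points at that localized copy. The same setting can also be applied programmatically:

    // Illustrative submit command:
    //   spark-submit --master yarn-cluster \
    //     --jars hdfs:///libs/snakeyaml-1.10.jar \
    //     --conf spark.executor.extraClassPath=snakeyaml-1.10.jar \
    //     myapp.jar
    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      // Relative path: resolved inside the YARN container's working directory.
      .set("spark.executor.extraClassPath", "snakeyaml-1.10.jar")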

Jar is cached in yarn-cluster mode?

2015-10-09 Thread Rex Xiong
I use "spark-submit -master yarn-cluster hdfs://.../a.jar .." to submit my app to yarn. Then I update this a.jar in HDFS, run the command again, I found a line of log that was been removed still exist in "yarn logs ". Is there a cache mechanism I need to disable? Thanks

Is it possible to disable AM page proxy in Yarn client mode?

2015-08-03 Thread Rex Xiong
In YARN client mode, the Spark driver UI URL is redirected to the YARN web proxy server, but I don't want to use this dynamic name. Is it possible to still use host:port as in standalone mode?

DESCRIBE FORMATTED doesn't work in Hive Thrift Server?

2015-07-05 Thread Rex Xiong
Hi, I try to use DESCRIBE FORMATTED for one table created in Spark, but it seems the results are all empty. I want to get metadata for the table; what are the other options? Thanks
    +------------+
    | result     |
    +------------+
    | # col_name |
    |
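One option the poster is asking for, as a minimal sketch (the table name is assumed): read the schema through the DataFrame API instead of the Thrift Server output.

    // Print column names, types, and nullability straight from the catalog.
    val df = sqlContext.table("my_table")
    df.schema.fields.foreach(f => println(s"${f.name}\t${f.dataType}\t(nullable=${f.nullable})"))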

How to get Master UI with ZooKeeper HA setup?

2015-05-11 Thread Rex Xiong
Hi, We have a 3-node master setup with ZooKeeper HA. The driver can find the master with spark://xxx:xxx,xxx:xxx,xxx:xxx, but how can I find the active Master's UI without looping through all 3 nodes? Thanks
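Absent a built-in redirect in this era, a workaround sketch under stated assumptions (hostnames are placeholders; the standalone Master web UI listens on 8080 and its /json endpoint reports whether that master is ALIVE or in STANDBY):

    import scala.io.Source

    val masters = Seq("master1", "master2", "master3")
    // Probe each master's UI and keep the one whose JSON status reads ALIVE.
    val active = masters.find { h =>
      try Source.fromURL(s"http://$h:8080/json").mkString.contains("ALIVE")
      catch { case _: Exception => false }
    }
    println(active.map(h => s"http://$h:8080").getOrElse("no ALIVE master found"))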

Re: Parquet Hive table become very slow on 1.3?

2015-04-22 Thread Rex Xiong
On Tue, Apr 21, 2015 at 1:13 AM, Rex Xiong bycha...@gmail.com wrote: We have a similar issue with massive Parquet files. Cheng Lian, could you have a look? 2015-04-08 15:47 GMT+08:00 Zheng, Xudong dong...@gmail.com: Hi Cheng, I tried both these patches, and it seems they still do not resolve my

Re: Parquet Hive table become very slow on 1.3?

2015-04-21 Thread Rex Xiong
We have a similar issue with massive Parquet files. Cheng Lian, could you have a look? 2015-04-08 15:47 GMT+08:00 Zheng, Xudong dong...@gmail.com: Hi Cheng, I tried both these patches, and it seems they still do not resolve my issue. And I found that most of the time is spent on this line in

Issue of sqlContext.createExternalTable with parquet partition discovery after changing folder structure

2015-04-04 Thread Rex Xiong
Hi Spark Users, I'm testing the new Spark 1.3 feature of Parquet partition discovery. I have 2 subfolders, each with 800 rows: /data/table1/key=1 and /data/table1/key=2. In spark-shell, I run: val t = sqlContext.createExternalTable("table1", "hdfs:///data/table1", "parquet"); t.count It shows 1600
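For the "after changing folder structure" part of the subject, one hedged follow-up (table name as above): Spark 1.3 caches Parquet metadata, so after partitions are added or moved under the table path, the cache can be refreshed instead of recreating the table.

    // Re-scan cached Parquet metadata after the folder layout changes.
    sqlContext.refreshTable("table1")
    sqlContext.table("table1").count()  // should now reflect the new layout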

Parquet timestamp support for Hive?

2015-04-03 Thread Rex Xiong
Hi, I got this error when creating a Hive table from a Parquet file: DDLTask: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.UnsupportedOperationException: Parquet does not support timestamp. See HIVE-6384 I checked HIVE-6384; it's fixed in Hive 0.14. The Hive in the Spark build is a customized
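A hedged sketch of DDL that reproduces the error on a pre-0.14 Hive (table, column, and location are illustrative); the TIMESTAMP column is what the Parquet SerDe rejects:

    // Fails with "Parquet does not support timestamp" until HIVE-6384 (Hive 0.14).
    sqlContext.sql("""
      CREATE EXTERNAL TABLE events (id STRING, event_time TIMESTAMP)
      STORED AS PARQUET
      LOCATION 'hdfs:///data/events'
    """)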

Return jobid for a hive query?

2015-03-03 Thread Rex Xiong
Hi there, I have an app talking to the Spark Hive Thrift Server using Hive ODBC, and querying is OK. But through this interface I can't get many running details when my query goes wrong; only one error message is shown. I want to get the job ID for my query so that I can go to the Application Detail UI to see what's going
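There is no job ID exposed at the ODBC layer in this era, but as a server-side sketch (the listener class name is hypothetical): a SparkListener registered in the Thrift Server's SparkContext can log every job ID as queries run, which can then be matched against the Application Detail UI.

    import org.apache.spark.scheduler.{SparkListener, SparkListenerJobStart}

    // Log each job ID (plus its properties, which carry scheduler/group info)
    // so a failing query can be traced to a job in the UI afterwards.
    class JobIdLogger extends SparkListener {
      override def onJobStart(jobStart: SparkListenerJobStart): Unit =
        println(s"Started job ${jobStart.jobId}, props=${jobStart.properties}")
    }
    sc.addSparkListener(new JobIdLogger)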