Is there a way to create multiple streams in spark streaming?

2015-10-20 Thread LinQili
Hi all, I wonder if there is a way to create some child streams while using Spark Streaming? For example, I create a netcat main stream that reads data from a socket, then create 3 different child streams on the main stream. In stream1, we do fun1 on the input data then print the result to the screen; in
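Each transformation applied to a DStream returns a new, independent DStream, so the fan-out the question describes falls out of the normal API. A minimal sketch, assuming a socket source on localhost:9999; the three operations here merely stand in for the hypothetical fun1/fun2/fun3:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object MultiStreamSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("MultiStreamSketch")
    val ssc = new StreamingContext(conf, Seconds(5))

    // The "main" stream: one netcat/socket source.
    val main = ssc.socketTextStream("localhost", 9999)

    // Three "child" streams derived from the same parent; each gets its
    // own transformation and output action, all in one StreamingContext.
    main.map(_.toUpperCase).print()                       // stream1: fun1 then print
    main.filter(_.nonEmpty).count().print()               // stream2: fun2 then print
    main.flatMap(_.split("\\s+")).countByValue().print()  // stream3: fun3 then print

    ssc.start()
    ssc.awaitTermination()
  }
}
```

All three child streams are driven by the same receiver, so each input record is seen by each of them.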

Spark Sql: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

2015-04-28 Thread LinQili
Hi all. I was launching a Spark SQL job from my own machine, not from the Spark cluster machines, and it failed. The exception info is: 15/04/28 16:28:04 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.RuntimeException: Unable to

Re: Exception while select into table.

2015-03-03 Thread LinQili
(HiveConf.ConfVars.HIVE_HADOOP_SUPPORTS_SUBDIRECTORIES) && item.isDir()) { throw new HiveException("checkPaths: " + src.getPath() + " has nested directory " + itemSource); } | On 3/3/15 14:36, LinQili wrote: Hi all, I was doing select using spark sql
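The quoted check fires when the source of the insert contains nested directories. A hedged workaround (assuming the Hive version in use honors these settings) is to enable subdirectory support before running the insert; the table names are the ones from the question, and the WHERE clause is omitted here since its operator was lost in the original message:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object NestedDirWorkaround {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("NestedDirWorkaround"))
    val hiveContext = new HiveContext(sc)

    // Tell Hive (and the underlying input format) to accept nested
    // directories under the source table's location.
    hiveContext.sql("SET hive.mapred.supports.subdirectories=true")
    hiveContext.sql("SET mapred.input.dir.recursive=true")

    hiveContext.sql(
      "INSERT INTO TABLE startup_log_uid_20150227 SELECT * FROM bak_startup_log_uid_20150227")
    sc.stop()
  }
}
```

If the settings are not honored, the alternative is to flatten the source table's directory layout before inserting.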

Exception while select into table.

2015-03-02 Thread LinQili
Hi all, I was doing a select using Spark SQL like: insert into table startup_log_uid_20150227 select * from bak_startup_log_uid_20150227 where login_time 1425027600. Usually, it got an exception:

Is there a way to delete hdfs file/directory using spark API?

2015-01-21 Thread LinQili
Hi, all. I wonder how to delete an HDFS file/directory using the Spark API?
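Spark itself exposes no delete call, but the Hadoop `FileSystem` client is already on the classpath of every Spark job, and `SparkContext.hadoopConfiguration` points it at the right cluster. A minimal sketch; the target path is hypothetical:

```scala
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.{SparkConf, SparkContext}

object HdfsDeleteSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("HdfsDeleteSketch"))

    // Reuse the job's Hadoop configuration so the FileSystem resolves
    // to the same HDFS the job is talking to.
    val fs = FileSystem.get(sc.hadoopConfiguration)

    val target = new Path("/user/linqili/tmp/old_output") // hypothetical path
    if (fs.exists(target)) {
      fs.delete(target, true) // recursive = true, so directories work too
    }
    sc.stop()
  }
}
```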

how to select the first row in each group by group?

2015-01-12 Thread LinQili
Hi all: I am using Spark SQL to read and write Hive tables. But there is an issue: how to select the first row in each group? In Hive, we could write HQL like this: SELECT imei FROM (SELECT imei, row_number() OVER (PARTITION BY imei ORDER BY login_time
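The same `row_number()` query can be submitted through the Hive support in Spark SQL, since window functions in the Spark versions discussed here are evaluated by going through HiveContext. A sketch, assuming a table named `startup_log` (hypothetical) with the `imei` and `login_time` columns from the question:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object FirstRowPerGroup {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("FirstRowPerGroup"))
    val hiveContext = new HiveContext(sc)

    // Number rows within each imei group by login_time, then keep rank 1:
    // the earliest login per imei.
    val firstLogins = hiveContext.sql(
      """SELECT imei, login_time FROM (
        |  SELECT imei, login_time,
        |         row_number() OVER (PARTITION BY imei ORDER BY login_time) AS rn
        |  FROM startup_log
        |) t
        |WHERE rn = 1""".stripMargin)

    firstLogins.collect().foreach(println)
    sc.stop()
  }
}
```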

How to export data from hive into hdfs in spark program?

2014-12-23 Thread LinQili
Hi all: I wonder if there is a way to export data from a Hive table into HDFS using Spark, like this: INSERT OVERWRITE DIRECTORY '/user/linqili/tmp/src' select * from $DB.$tableName
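`INSERT OVERWRITE DIRECTORY` is plain HiveQL, so one approach is simply to issue it through a HiveContext. A minimal sketch, using the path from the question and `test_spark.src` as a stand-in for `$DB.$tableName`:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object ExportToHdfs {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ExportToHdfs"))
    val hiveContext = new HiveContext(sc)

    // Writes the query result as files under the given HDFS directory,
    // replacing whatever was there before.
    hiveContext.sql(
      "INSERT OVERWRITE DIRECTORY '/user/linqili/tmp/src' SELECT * FROM test_spark.src")
    sc.stop()
  }
}
```

An alternative that avoids HiveQL entirely is `hiveContext.sql("SELECT * FROM ...").rdd.saveAsTextFile(path)`, at the cost of text formatting being decided by `Row.toString`.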

Can we specify driver running on a specific machine of the cluster on yarn-cluster mode?

2014-12-18 Thread LinQili
Hi all, in yarn-cluster mode, can we make the driver run on a specific machine that we choose in the cluster? Or even on a machine not in the cluster?

RE: Issues on schemaRDD's function got stuck

2014-12-09 Thread LinQili
I checked my code again and located the issue: if we do the `load data inpath` before the select statement, the application gets stuck; if we don't, it doesn't. Log info: 14/12/09 17:29:33 ERROR actor.ActorSystemImpl: Uncaught fatal error from thread

RE: Issues on schemaRDD's function got stuck

2014-12-09 Thread LinQili
I checked my code again and located the issue: if we do the `load data inpath` before the select statement, the application gets stuck; if we don't, it doesn't. Get-stuck code: val sqlLoadData = s"LOAD DATA INPATH '$currentFile' OVERWRITE INTO TABLE $tableName"

Issues on schemaRDD's function got stuck

2014-12-08 Thread LinQili
Hi all: I was running HiveFromSpark on yarn-cluster. When I got the hive select's result schemaRDD and tried to run `collect()` on it, the application got stuck, and I don't know what's wrong with it. Here is my code: val sqlStat = s"SELECT * FROM $TABLE_NAME" val result = hiveContext.hql(sqlStat)

Issue on [SPARK-3877][YARN]: Return code of the spark-submit in yarn-cluster mode

2014-12-05 Thread LinQili
Hi, all: According to https://github.com/apache/spark/pull/2732, when a Spark job fails or exits nonzero in yarn-cluster mode, spark-submit will get the corresponding return code of the Spark job. But when I tried on a spark-1.1.1 yarn cluster, spark-submit returned zero anyway. Here is my spark

RE: Issue on [SPARK-3877][YARN]: Return code of the spark-submit in yarn-cluster mode

2014-12-05 Thread LinQili
I tried in Spark client mode; spark-submit can get the correct return code from the Spark job. But in yarn-cluster mode, it failed. From: lin_q...@outlook.com To: u...@spark.incubator.apache.org Subject: Issue on [SPARK-3877][YARN]: Return code of the spark-submit in yarn-cluster mode Date: Fri, 5

RE: Issue on [SPARK-3877][YARN]: Return code of the spark-submit in yarn-cluster mode

2014-12-05 Thread LinQili
I tried another test code: def main(args: Array[String]) { if (args.length != 1) { Util.printLog("ERROR", "Args error - arg1: BASE_DIR"); exit(101) } val currentFile = args(0).toString val DB = "test_spark" val tableName = "src" val sparkConf = new
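Setting Spark aside, the exit-code plumbing itself can be sketched with `scala.sys.process`: from the caller's point of view, spark-submit is just another child process, and a nonzero status like the `exit(101)` in the test code should surface as the child's return code.

```scala
import scala.sys.process._

// A Spark-free sketch of return-code propagation: the caller of a child
// process sees the child's exit status as the return value of `!`.
object ExitCodeSketch {
  def run(cmd: Seq[String]): Int = Process(cmd).! // blocks until exit, returns status

  def main(args: Array[String]): Unit = {
    println(run(Seq("sh", "-c", "exit 0")))   // a job that succeeds
    println(run(Seq("sh", "-c", "exit 101"))) // mirrors exit(101) in the test code
  }
}
```

When spark-submit itself reports zero regardless of the job's outcome, as described here for yarn-cluster mode, no caller-side check like this can recover the real status; that is exactly what the PR referenced in the thread was about.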

Issues about running on client in standalone mode

2014-11-24 Thread LinQili
Hi all: I deployed a Spark client on my own machine. I put Spark at the path `/home/somebody/spark`, while the cluster workers' Spark home path is `/home/spark/spark`. When I launched the jar, it showed: `AppClient$ClientActor: Executor updated: app-20141124170955-11088/12 is now FAILED