number of partitions for hive schemaRDD

2015-02-26 Thread masaki rikitoku
Hi all

I'm currently trying Spark SQL with HiveContext.

When I execute an HQL query like the following:

---

val ctx = new org.apache.spark.sql.hive.HiveContext(sc)
import ctx._

val queries = ctx.hql(
  "select keyword from queries where dt = '2015-02-01' limit 1000")

---

It seems that the number of partitions of the resulting SchemaRDD is set to 1.

Is this the expected behavior for SchemaRDD / Spark SQL / HiveContext?

Is there any way to set the number of partitions to an arbitrary value,
other than an explicit repartition?
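For reference, a minimal sketch of the explicit-repartition workaround mentioned above. It reuses the `ctx` and `queries` table from the example; this only restores parallelism after the fact and is not a way to change the query's initial partition count:

```scala
// Sketch only: assumes the HiveContext `ctx` from the example above.
val result = ctx.hql(
  "select keyword from queries where dt = '2015-02-01' limit 1000")

// A global `limit` collapses the rows into a single partition; an
// explicit repartition afterwards spreads them back out:
val spread = result.repartition(8)   // 8 partitions, chosen arbitrarily
```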


Masaki Rikitoku

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: IDF for ml pipeline

2015-02-03 Thread masaki rikitoku
Thank you for your reply. I will do it.



--
Sent from Mailbox

On Tue, Feb 3, 2015 at 6:12 PM, Xiangrui Meng men...@gmail.com wrote:

 Yes, we need a wrapper under spark.ml. Feel free to create a JIRA for
 it. -Xiangrui
 On Mon, Feb 2, 2015 at 8:56 PM, masaki rikitoku rikima3...@gmail.com wrote:
 Hi all

 I am trying the ml pipeline for text classification now.

 Recently, I succeeded in executing pipeline processing in the ml
 package, with a pipeline consisting of an original Japanese tokenizer,
 HashingTF, and LogisticRegression.

 Then I failed to execute the pipeline with the IDF in the mllib
 package directly.

 To use the IDF feature in the ml package, do I have to implement a
 wrapper for IDF in the ml package, like HashingTF?

 best

 Masaki Rikitoku



IDF for ml pipeline

2015-02-02 Thread masaki rikitoku
Hi all

I am trying the ml pipeline for text classification now.

Recently, I succeeded in executing pipeline processing in the ml
package, with a pipeline consisting of an original Japanese tokenizer,
HashingTF, and LogisticRegression.

Then I failed to execute the pipeline with the IDF in the mllib
package directly.

To use the IDF feature in the ml package, do I have to implement a
wrapper for IDF in the ml package, like HashingTF?

best

Masaki Rikitoku
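Until such a wrapper exists, one workaround is to apply mllib's IDF outside the ml pipeline, directly on the RDD of term-frequency vectors. A sketch against the Spark 1.2-era mllib feature APIs (the input data below is purely illustrative):

```scala
import org.apache.spark.mllib.feature.{HashingTF, IDF}
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD

// Tokenized documents; in practice these would come from the
// tokenizer stage of the pipeline.
val docs: RDD[Seq[String]] = sc.parallelize(Seq(
  Seq("spark", "ml", "pipeline"),
  Seq("spark", "mllib", "idf")))

val tf: RDD[Vector] = new HashingTF().transform(docs)
tf.cache()                          // IDF makes two passes over tf
val idfModel = new IDF().fit(tf)    // learn document frequencies
val tfidf: RDD[Vector] = idfModel.transform(tf)
```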
