017 at 3:39 AM, Raju Bairishetti <r...@apache.org> wrote:
> @Eli, Thanks for the suggestion. If you do not mind, could you please
> elaborate on those approaches?
>
> On Mon, Mar 6, 2017 at 7:29 PM, Eli Super <eli.su...@gmail.com> wrote:
>
>> Hi
>>
>> Try to impleme
Hi
Try to implement binning and/or feature engineering (smart feature
selection, for example).
Good luck
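Equal-frequency binning is suggested but not shown anywhere in the thread; below is a minimal pure-Python sketch of it (the data and bin count are invented for illustration — in Spark one would normally reach for Bucketizer or, in later versions, QuantileDiscretizer):

```python
# Equal-frequency (quantile) binning: each bin receives roughly the
# same number of observations. Pure-Python sketch, not Spark code.

def equal_frequency_bins(values, n_bins):
    """Return the bin index (0..n_bins-1) for every value."""
    # Indices of the values, ordered by value (i.e. their ranks).
    ranked = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(ranked):
        bins[idx] = min(rank * n_bins // len(values), n_bins - 1)
    return bins

values = [1, 7, 3, 9, 2, 8, 5, 4]
print(equal_frequency_bins(values, 2))  # → [0, 1, 0, 1, 0, 1, 1, 0]
```

Each of the two bins ends up with exactly half of the eight values.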
On Mon, Mar 6, 2017 at 6:56 AM, Raju Bairishetti wrote:
> Hi,
> I am new to Spark MLlib. I am using the FPGrowth model for finding related
> items.
>
> Number of transactions
Hi
I have a Windows laptop.
I just downloaded the Spark 1.4.1 source code.
I am trying to compile *org.apache.spark.mllib.fpm* with *mvn*.
My goal is to replace the *original* org\apache\spark\mllib\fpm\* in
*spark-assembly-1.4.1-hadoop2.6.0.jar*
As I understand from this link
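The rebuild-and-patch he describes could look roughly like this — a sketch only, assuming the source-tree layout and jar path below (`mvn -pl` builds a single module; `jar uf` updates entries inside an existing jar):

```shell
# Build only the mllib module (plus its in-tree dependencies).
cd spark-1.4.1
mvn -pl mllib -am -DskipTests package

# Overwrite the fpm classes inside the assembly jar
# (the jar path here is an assumption).
jar uf /path/to/spark-assembly-1.4.1-hadoop2.6.0.jar \
    -C mllib/target/classes org/apache/spark/mllib/fpm
```

On a Windows laptop these commands would run in Git Bash or a similar shell.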
Hi Spark Users,
I need your help.
I have some output after running DecisionTree:
I work with Jupyter notebook and Python 2.7.
How can I create a graphical representation of the Decision Tree model?
In sklearn I can use tree.export_graphviz; in R I can see the Decision
Tree output as well.
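Spark 1.x MLlib has no export_graphviz equivalent, but DecisionTreeModel.toDebugString() yields an indented text tree that can be converted to Graphviz DOT by hand. A sketch, assuming the usual If/Else/Predict layout of that string (the sample below is invented, not real model output):

```python
def debug_string_to_dot(text):
    """Turn an indented If/Else/Predict tree dump into Graphviz DOT."""
    lines = [l for l in text.splitlines() if l.strip()]
    dot = ["digraph tree {"]
    stack = []  # (indent, node_id) of ancestors on the current path
    for node_id, line in enumerate(lines):
        indent = len(line) - len(line.lstrip())
        label = line.strip().replace('"', "'")
        while stack and stack[-1][0] > indent:
            stack.pop()
        if stack and stack[-1][0] == indent:
            stack.pop()  # sibling branch (the matching If/Else) at same depth
        dot.append('  n%d [label="%s"];' % (node_id, label))
        if stack:
            dot.append("  n%d -> n%d;" % (stack[-1][1], node_id))
        stack.append((indent, node_id))
    dot.append("}")
    return "\n".join(dot)

sample = """If (feature 0 <= 5.0)
 Predict: 0.0
Else (feature 0 > 5.0)
 Predict: 1.0"""
print(debug_string_to_dot(sample))
```

The resulting string can be rendered with `dot -Tpng` or displayed in the notebook via the graphviz package.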
Hi
I work with pyspark & Spark 1.5.2.
Currently, saving an RDD into a csv file is very, very slow; it uses only 2% CPU.
I use:
my_dd.write.format("com.databricks.spark.csv").option("header",
"false").save('file:///my_folder')
Is there a way to save the csv faster?
Many thanks
Hi
I'm running Spark locally on a Windows 2012 R2 server.
No Hadoop installed.
I'm getting the following error:
*WARN ZlibFactory: Failed to load/initialize native-zlib library*
*Is it something to worry about ?*
Thanks !
"Bucketizer transforms a column of continuous features to a
> column of feature buckets, where the buckets are specified by users."
>
> [1]: http://spark.apache.org/docs/latest/ml-features.html#bucketizer
>
> On Mon, Jan 25, 2016 at 5:34 AM, Eli Super <eli.su...@gmail.com&
Hi
I'm trying to select all non-NULL values from a column that contains NULL
values
with
sqlContext.sql("select my_column from my_table where my_column <> null
").show(15)
or
sqlContext.sql("select my_column from my_table where my_column != null
").show(15)
I get an empty result.
Thanks !
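The empty result is standard SQL three-valued logic: any comparison with NULL, including `<> null`, evaluates to unknown, so no row qualifies; the correct predicate is `IS NOT NULL`. A stand-alone sketch with Python's built-in sqlite3 (the table and column names are made up to mirror the query) shows the same behaviour:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (my_column TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?)",
                 [("a",), (None,), ("b",)])

# Comparisons with NULL are never true, so this returns no rows.
print(conn.execute(
    "SELECT my_column FROM my_table WHERE my_column <> NULL").fetchall())
# → []

# IS NOT NULL is the correct predicate.
print(conn.execute(
    "SELECT my_column FROM my_table WHERE my_column IS NOT NULL").fetchall())
# → [('a',), ('b',)]
```

The same change applies verbatim inside sqlContext.sql(...).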
Hi
What is the best way to discretize a continuous variable within Spark
DataFrames?
I want to discretize some variable 1) by equal frequency 2) by k-means.
I usually use R for this purpose:
_http://www.inside-r.org/packages/cran/arules/docs/discretize
R code for example :
### equal frequency
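Spark 1.x has no direct equivalent of arules::discretize; equal-width and quantile splits can go through Bucketizer, and k-means cut points can be computed by clustering the single column. A pure-Python sketch of 1-D k-means binning (Lloyd's algorithm; the data and k are made up, and k >= 2 is assumed):

```python
def kmeans_1d(values, k, iters=20):
    """Cluster scalar values into k bins with Lloyd's algorithm."""
    vs = sorted(values)
    # Initialise centres at evenly spaced order statistics.
    centers = [vs[i * (len(vs) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Move each centre to the mean of its group (keep it if empty).
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]

values = [1.0, 1.2, 0.9, 10.0, 10.5, 9.8]
print(kmeans_1d(values, 2))  # → [0, 0, 0, 1, 1, 1]
```

The returned bin indices can then be joined back onto the DataFrame, or the final centres used as Bucketizer splits.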
>
> And yeah, that looks like Python – I'm not hot with Python, but it may be
> capitalised as False or FALSE?
>
>
>
>
>
> *From:* Eli Super [mailto:eli.su...@gmail.com]
> *Sent:* 21 January 2016 14:48
> *To:* Spencer, Alex (Santander)
> *Cc:* user@spark.apa
Hi
I'm trying to save parts of a large table as csv files.
I use the following commands:
sqlContext.sql("select * from my_table where trans_time between '2015/12/18
12:00' and '2015/12/18
12:06'").write.format("com.databricks.spark.csv").option("header",
"false").save('00_06')
and
sqlContext.sql("select
Hi
I have a large parquet file.
I need to cast a whole column to timestamp format, then save it.
What is the right way to do it?
Thanks a lot
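In Spark the usual route is Column.cast("timestamp"), falling back to a UDF when the strings use a non-standard layout. A stand-alone Python sketch of the per-value conversion such a UDF would perform, assuming a 'yyyy/MM/dd HH:mm' layout like the one seen elsewhere in these threads:

```python
from datetime import datetime

# Hypothetical sample values; slash-separated timestamps like these are
# not parsed by a plain cast, so each string is converted explicitly.
raw = ["2015/12/15 15:52", "2015/12/18 12:06"]

def to_timestamp(s):
    return datetime.strptime(s, "%Y/%m/%d %H:%M")

parsed = [to_timestamp(s) for s in raw]
print(parsed[0])  # → 2015-12-15 15:52:00
```

Wrapped in a UDF, the same function yields a proper timestamp column that can then be written back to parquet.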
(int numRows,
>
> boolean truncate)
>
>
>
>
>
> Kind Regards,
>
> Alex.
>
>
>
> *From:* Eli Super [mailto:eli.su...@gmail.com]
> *Sent:* 14 January 2016 13:09
> *To:* user@spark.apache.org
> *Subject:* Spark SQL . How to enlarge output rows ?
>
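The quoted signature is the whole answer: df.show(numRows, truncate) with truncate set to Python's False prints untruncated cells. Under the default, Spark abbreviates any cell longer than 20 characters to its first 17 plus "..."; the sketch below is my own re-implementation of that rule for illustration, not Spark code:

```python
def abbreviate(cell, width=20):
    """Mimic Spark's default cell truncation in DataFrame.show()."""
    s = str(cell)
    return s if len(s) <= width else s[:width - 3] + "..."

print(abbreviate("2015/12/15 15:52:34"))      # → 2015/12/15 15:52:34
print(abbreviate("2015/12/15 15:52:34.123"))  # → 2015/12/15 15:52:...
```

So for the day_time question, sqlContext.sql("select day_time from my_table limit 10").show(10, False) prints the full values.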
I just build Spark locally, only with the csv package
and the thrift server; what Hadoop version should I use to avoid warnings?
Thanks a lot !
On Thu, Jan 21, 2016 at 9:08 AM, Eli Super <eli.su...@gmail.com> wrote:
> Hi
>
> I get WARNINGS when try to build spark 1.6.0
>
> overall I get S
Hi
I get WARNINGS when trying to build Spark 1.6.0.
Overall I get a SUCCESS message on all projects.
The command I used:
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Dscala-2.10 -Phive
-Phive-thriftserver -DskipTests clean package
from pom.xml:
<scala.version>2.10.5</scala.version>
<scala.binary.version>2.10</scala.binary.version>
example of warnings :
[INFO]
Hi
After executing the sql
sqlContext.sql("select day_time from my_table limit 10").show()
my output looks like:
+--------------------+
|            day_time|
+--------------------+
|2015/12/15 15:52:...|
|2015/12/15 15:53:...|
|2015/12/15 15:52:...|
|2015/12/15 15:52:...|
|2015/12/15 15:52:...|