Hi All,
I need to print AUC and area under PR for a GBTClassifier model. It works fine for
RandomForestClassifier but not for GBTClassifier, even though the rawPrediction column
is not in the original data in either case.
the code is:
... // Set up Pipeline
val stages = new ...
val cv = new CrossValidator()
  .setEstimator(pipeline)
  .setEvaluator(new RegressionEvaluator)
  .setEstimatorParamMaps(paramGrid)
val cvModel = cv.fit(data)
val plmodel = cvModel.bestModel.asInstanceOf[PipelineModel]
val lrModel = plmodel.stages(0).asInstanceOf[LinearRegressionModel]
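A minimal Scala sketch of one possible workaround, assuming a fitted cvModel and a labeled
test DataFrame testData (both names are assumptions): older GBTClassificationModel versions
emit only a prediction column and no rawPrediction, so AUC / area under PR can instead be
computed from the (prediction, label) pairs with MLlib's BinaryClassificationMetrics:

import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

val predictions = cvModel.transform(testData)   // testData is an assumed DataFrame with a "label" column
val scoreAndLabels = predictions
  .select("prediction", "label")
  .rdd
  .map(r => (r.getDouble(0), r.getDouble(1)))
val metrics = new BinaryClassificationMetrics(scoreAndLabels)
println(s"areaUnderROC = ${metrics.areaUnderROC()}")
println(s"areaUnderPR  = ${metrics.areaUnderPR()}")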
On 24 November 2016 at 10:23, Zhiliang Zhu wrote:
Scala code would also work for me, if there is some solution.
On Friday, November 25, 2016 1:27 AM, Zhiliang Zhu
<zchl.j...@yahoo.com.INVALID> wrote:
Hi All,
I want to print the specific tree or forest structure from a pipeline model.
However, it seems that I hit more issues with XXXClassifier and
XXXClassificationModel, as in the code below:
...
GBTClassifier gbtModel = new GBTClassifier();
ParamMap[] grid = new ...
On November 24, 2016 2:15 AM, Xiaomeng Wan <shawn...@gmail.com> wrote:
You can use pipelinemodel.stages(0).asInstanceOf[RandomForestModel]. The
stage index (0 in the example) depends on the order in which you call setStages.
Shawn
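A short Scala sketch of the suggestion above (a sketch only: the stage index 0 and the
variable names pipeline / trainingData are assumptions, and for spark.ml classification the
fitted class is RandomForestClassificationModel):

import org.apache.spark.ml.PipelineModel
import org.apache.spark.ml.classification.RandomForestClassificationModel

val plModel: PipelineModel = pipeline.fit(trainingData)   // assumed fitted pipeline
val rfModel = plModel.stages(0).asInstanceOf[RandomForestClassificationModel]
println(rfModel.toDebugString)   // prints the full tree/forest structure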
On 23 November 2016 at 10:21, Zhiliang Zhu <zchl.j...@yahoo.com.inval
Dear All,
I am building a model with a Spark pipeline, and in the pipeline I used the Random Forest
algorithm as one of its stages.
If I just use Random Forest directly rather than through a pipeline, I can see the
information about the forest via the APIs
rfModel.toDebugString() and rfModel.toString() .
However, while it
Hi All,
Here I have a lot of data, around 1,000,000 rows; 97% of them are the negative
class and 3% of them are the positive class. I applied the Random Forest algorithm to
build the model and predict on the testing data.
For the data preparation, i. firstly I randomly split all the data into training
data
... a static dataset small
enough to work with, editing the query, then retesting, repeatedly until you cut
the execution time by a significant fraction. - Use the Spark UI or the Spark shell
to check for skew and make sure partitions are evenly distributed.
On Jul 18, 2016, at 3:33 AM, Zhiliang Zhu wrote:
or clue is also good.
Thanks in advance~
On Tuesday, July 19, 2016 11:05 AM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
Hi Mungeol,
Thanks a lot for your help. I will try that.
On Tuesday, July 19, 2016 9:21 AM, Mungeol Heo <mungeol@gmail.com> wrote:
try to set --driver-memory Xg, where X is as large as can be set.
On Monday, July 18, 2016 6:31 PM, Saurav Sinha
wrote:
Hi,
I am running spark job.
Master memory - 5G, executor memory 10G (running on 4 nodes).
My job is getting killed as the number of partitions increases.
Because it's complex, you can use something
like the EXPLAIN command to show what is going on.
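A small sketch of the EXPLAIN suggestion, assuming a Spark 2.x SparkSession named spark
(use sqlContext on 1.x) and a placeholder query; both the SQL EXPLAIN statement and
DataFrame.explain print the query plan:

val df = spark.sql("SELECT t1.id, t2.value FROM t1 JOIN t2 ON t1.id = t2.id")
df.explain(true)   // extended output: parsed, analyzed, optimized and physical plans
spark.sql("EXPLAIN EXTENDED SELECT count(*) FROM t1").show(false)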
On Jul 18, 2016, at 5:20 PM, Zhiliang Zhu <zchl.j...@yahoo.com.INVALID> wrote:
the SQL logic in the program is very complex, so I will not describe the
detailed code here.
On Monday, Jul
the SQL logic in the program is very complex, so I will not describe the
detailed code here.
On Monday, July 18, 2016 6:04 PM, Zhiliang Zhu
<zchl.j...@yahoo.com.INVALID> wrote:
Hi All,
Here we have one application; it needs to extract different columns from 6 Hive
tables and then do some simple calculation. There are around 100,000
rows in each table, and finally it needs to output another table or file (with a
consistent column format).
However, after lots of
... memory for example, in particular
spark.yarn.executor.memoryOverhead.
Everything else you mention is a symptom of YARN shutting down your
jobs because your memory settings don't match what your app does.
They're not problems per se, based on what you have provided.
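A hedged example of the settings mentioned above (the values are illustrative only; the
spark.yarn.* key is the pre-2.3 name, newer releases use spark.executor.memoryOverhead):

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.executor.memory", "10g")
  .set("spark.yarn.executor.memoryOverhead", "2048")   // MB reserved for off-heap / JVM overhead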
On Mon, Jun 20, 2016 at 9:17 AM, Zhili
... shuffle
operation? --WBR, Alexander From: Zhiliang Zhu
Sent: 17 June 2016 14:10
To: User; kp...@hotmail.com
Subject: Re: spark job automatically killed without rhyme or reason
Hi Alexander,
is your yarn userlog just the executor log?
as those logs seem a
currently ...
Thank you in advance~
On Friday, June 17, 2016 6:53 PM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
Hi Alexander,
Thanks a lot for your reply.
Yes, it is submitted by YARN. Do you just mean the executor log file obtained by way of yarn
logs -applicationId id,
in this file
... check yarn userlogs for more information… --WBR, Alexander
From: Zhiliang Zhu
Sent: 17 June 2016 9:36
To: Zhiliang Zhu; User
Subject: Re: spark job automatically killed without rhyme or reason
Has anyone ever met a similar problem? It is quite strange ...
On Friday, June 17, 2016 2:1
in advance~
On Friday, June 17, 2016 6:53 PM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
Hi Alexander,
Thanks a lot for your reply.
Yes, it is submitted by YARN. Do you just mean the executor log file obtained by way of yarn
logs -applicationId id,
in this file, both in some containers'
tasks are executed. In this
situation, please check yarn userlogs for more information… --WBR, Alexander
From: Zhiliang Zhu
Sent: 17 June 2016 9:36
To: Zhiliang Zhu; User
Subject: Re: spark job automatically killed without rhyme or reason
Has anyone ever met a similar problem, whic
Has anyone ever met a similar problem? It is quite strange ...
On Friday, June 17, 2016 2:13 PM, Zhiliang Zhu
<zchl.j...@yahoo.com.INVALID> wrote:
Hi All,
I have a big job which takes more than one hour to run in full;
however, it is very unreasonable that it exits &
Hi All,
I have a big job which takes more than one hour to run in full;
however, it very unreasonably exits and finishes midway (almost
80% of the job actually finished, but not all), without any apparent error or
exception log.
I submitted the same job many times, it
Just for test, since it seemed the user email system had something wrong
a while ago; it is okay now.
On Friday, June 17, 2016 12:18 PM, Zhiliang Zhu
<zchl.j...@yahoo.com.INVALID> wrote:
On Tuesday, May 17, 2016 10:44 AM, Zhiliang Zhu
<zchl.j...@yahoo.com.INVALID> wrote:
Hi All,
For a given DataFrame created by Hive SQL, it is required to
add one more column based on an existing column, and the result DataFrame should
also keep the previous columns.
final double DAYS_30 = 1000 * 60 * 60 * 24 * 30.0;
//DAYS_30 seems difficult to call
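A minimal Scala sketch of one way to do this, assuming a DataFrame df with a millisecond
timestamp column ts (both names are assumptions); withColumn keeps every existing column
and appends the derived one:

import org.apache.spark.sql.functions._

val DAYS_30 = 1000.0 * 60 * 60 * 24 * 30
val result = df.withColumn("ts_in_30day_units", col("ts") / lit(DAYS_30))
result.show()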
For some file on HDFS, it is necessary to copy/move it to another specific
HDFS directory, and the name should stay unchanged. I need to do
it in a Spark program, not with HDFS commands. Is there any code for this? It does not
seem to be answered by searching the Spark docs ...
Thanks in advance!
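A hedged sketch of doing the move from inside a Spark program via the Hadoop FileSystem
API (the paths are placeholders and spark is an assumed SparkSession):

import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}

val conf = spark.sparkContext.hadoopConfiguration
val fs   = FileSystem.get(conf)
val src  = new Path("/data/input/part-00000")
val dst  = new Path("/data/archive/part-00000")
fs.rename(src, dst)                              // move, keeping the file name
// FileUtil.copy(fs, src, fs, dst, false, conf)  // copy instead of move, if needed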
In order to make a job run faster, some parameters would be specified on the
command line, such as --executor-cores , --executor-memory and --num-executors
...
However, as tested, it seems that those numbers cannot be set arbitrarily,
or some trouble would be caused for the cluster. What is
based on a subset of the rows in rdd0? That way you can
increase the parallelism.
Cheers
On Mon, Dec 21, 2015 at 9:40 AM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
Hi Ted,
Thanks a lot for your kind reply.
I need to convert this rdd0 into another rdd1; the rows of rdd1 are generated
fro
cases? If there is no shuffle, you can
collapse all these functions into one, right? In the meantime, it is not
recommended to collect all data to the driver.
Thanks.
Zhan Zhang
On Dec 21, 2015, at 3:44 AM, Zhiliang Zhu <zchl.j...@yahoo.com.INVALID> wrote:
Dear All,
I need to iterate some job / RDD quite a
Dear All,
For some RDD, when there is just one partition, the operations &
arithmetic would only run on a single task; the RDD has lost all the parallelism benefit
of the Spark system ...
Is it exactly like that?
Thanks very much in advance! Zhiliang
However, as tested, it seemed that checkpoint is more costly than collect ...
Hopefully you are using the Kryo serializer already.
That would be all right. From your experience, does Kryo improve efficiency
noticeably ...
Regards, Sab
On Mon, Dec 21, 2015 at 5:51 PM, Zhiliang Zhu <zchl.j...@yahoo.co
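A brief sketch of enabling Kryo, since the reply above asks about it (Record is a
hypothetical payload class; registering classes is optional but usually increases Kryo's
benefit):

import org.apache.spark.SparkConf

case class Record(id: Long, features: Array[Double])   // hypothetical class carried in the RDDs

val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .registerKryoClasses(Array(classOf[Record]))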
result depend on the last
iteration? If so, how does it depend on it? I think either you can optimize your
implementation, or Spark is not the right tool for your specific application.
Thanks.
Zhan Zhang
On Dec 21, 2015, at 10:43 AM, Zhiliang Zhu <zchl.j...@yahoo.com.INVALID> wrote:
What is differ
Dear All,
I need to iterate some job / RDD quite a lot of times, but I am stuck on the
problem that Spark only accepts around 350 map calls before it needs
one action function; besides, dozens of actions will obviously increase the run
time. Is there any proper way ...
As tested, there
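A hedged sketch of one common workaround for very long map lineages: periodically cut the
lineage with checkpoint (the iteration count, interval and path are illustrative, and sc
is an assumed SparkContext):

sc.setCheckpointDir("hdfs:///tmp/spark-checkpoints")
var rdd = sc.parallelize(1 to 1000000)
for (i <- 1 to 350) {
  rdd = rdd.map(_ + 1)
  if (i % 50 == 0) {
    rdd.checkpoint()   // truncates the lineage so the DAG stops growing
    rdd.count()        // an action is needed to materialize the checkpoint
  }
}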
false)(implicit ord: Ordering[T] = null)
Cheers
On Mon, Dec 21, 2015 at 2:47 AM, Zhiliang Zhu <zchl.j...@yahoo.com.invalid>
wrote:
Dear All,
For some RDD, when there is just one partition, the operations &
arithmetic would only run on a single task; the RDD has lost all the parallelism ben
Use matrix SVD decomposition; Spark has the library:
http://spark.apache.org/docs/latest/mllib-dimensionality-reduction.html#singular-value-decomposition-svd
On Thursday, December 10, 2015 7:33 PM, Arunkumar Pillai
wrote:
Hi
I need to find inverse
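A small sketch of the SVD suggestion above using MLlib's RowMatrix (a toy 2x2 matrix and
an assumed SparkContext sc; a pseudoinverse can then be assembled as V * diag(1/s) * U^T,
dropping near-zero singular values):

import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

val rows = sc.parallelize(Seq(
  Vectors.dense(4.0, 1.0),
  Vectors.dense(1.0, 3.0)))
val mat = new RowMatrix(rows)
val svd = mat.computeSVD(2, computeU = true)   // A = U * diag(s) * V^T
println(svd.s)                                 // singular values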
ious order among the elements,
and will it also not work ?
Thanks very much in advance!
On Monday, December 7, 2015 11:32 AM, Zhiliang Zhu <zchl.j...@yahoo.com>
wrote:
On Monday, December 7, 2015 10:37 AM, DB Tsai <dbt...@dbtsai.com> wrote:
Only beginning a
Hi All,
I need to optimize an objective function with some linear constraints using a
genetic algorithm. I would like to get as much parallelism for it as possible with Spark.
repartition / shuffle may be used sometimes in it; however, is the repartition API
very costly?
Thanks in advance! Zhiliang
threading engine). In general you need to do
performance testing to see if a repartition is worth the shuffle time. A
common model is to repartition the data once after ingest to achieve
parallelism and avoid shuffles whenever possible later. From: Zhiliang Zhu
[mailto:zchl.j...@yahoo.c
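A sketch of the "repartition once after ingest" pattern described above (the path and
partition count are placeholders, sc is an assumed SparkContext):

val raw  = sc.textFile("hdfs:///data/input")   // may arrive with too few partitions
val data = raw.repartition(200).cache()        // pay for one shuffle up front, then reuse
println(data.getNumPartitions)
// later stages (map, filter, fitness evaluation, ...) run on the already balanced
// partitions without further repartition calls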
https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D
On Fri, Dec 4, 2015 at 10:30 PM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
> Hi All,
>
> I would like to compare any two adjacent elements in one given rdd, just as
> the single machine code part:
>
> int a[N] = {...};
> for (int i=0;
d API , or repartition ?
Thanks a lot in advance!
Sincerely,
DB Tsai
--
Web: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D
On Sun, Dec 6, 2015 at 6:27 PM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
> On Saturday, Decemb
On Saturday, December 5, 2015 3:52 PM, Zhiliang Zhu
<zchl.j...@yahoo.com.INVALID> wrote:
Hi DB Tsai,
Thanks very much for your kind reply!
Sorry, one more issue: as tested it seems that filter can only return a
JavaRDD of the same type, not any other JavaRDD, is that right? Then it is not very convenient
Hi All,
I would like to compare any two adjacent elements in one given RDD, just as in the
single-machine code:
int a[N] = {...};
for (int i = 0; i < N - 1; ++i) { compareFun(a[i], a[i+1]); }
...
mapPartitions may work for some situations; however, it cannot compare
elements in different
cerely,
DB Tsai
--
Web: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D
On Fri, Dec 4, 2015 at 10:30 PM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
> Hi All,
>
> I would like to compare any two adjacent elements in one given rdd,
Hi all,
I have an optimization problem; I have googled a lot but still did not find
the exact algorithm or third-party open package to apply to it.
Its type is like this:
Objective function: f(x1, x2, ..., xn) (n >= 100, and f may be linear or non-linear)
Constraint functions:
x1 + x2 + ... +
On Thursday, November 19, 2015 1:46 PM, Ted Yu <yuzhih...@gmail.com> wrote:
Have you looked at https://github.com/scalanlp/breeze/wiki
Cheers
On Nov 18, 2015, at 9:34 PM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
Dear Jack,
As is known, Breeze is numerical calculation package wr
Dear Jack,
As is known, Breeze is a numerical calculation package written in Scala; Spark
MLlib also uses it as the underlying package for linear algebra. Here I am also
preparing to use Breeze for nonlinear equation optimization; however, it seems
that I cannot find the exact doc or API for Breeze
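A hedged, generic Breeze example, not specific to the question above: minimizing a simple
quadratic with breeze.optimize.LBFGS, which is the kind of API Breeze exposes for
unconstrained nonlinear optimization (the dimension and target values are made up):

import breeze.linalg.DenseVector
import breeze.optimize.{DiffFunction, LBFGS}

val target = DenseVector.fill(5)(3.0)
val f = new DiffFunction[DenseVector[Double]] {
  def calculate(x: DenseVector[Double]): (Double, DenseVector[Double]) = {
    val diff = x - target
    (diff dot diff, diff * 2.0)   // value and gradient of ||x - target||^2
  }
}
val lbfgs = new LBFGS[DenseVector[Double]](maxIter = 100, m = 7)
val xOpt  = lbfgs.minimize(f, DenseVector.zeros[Double](5))   // converges to `target`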
(Array(n_f))
val n_linesRDD = n_lines.map(n => {
  // Read and return 5 lines (n._1) from the file (n._2)
})
Thanks, Best Regards
On Thu, Oct 29, 2015 at 9:51 PM, Zhiliang Zhu <zchl.j...@yahoo.com.invalid>
wrote:
Hi All,
There is some file with N + M lines, and I need to r
when creating the Function object with new, and inside the Function inner class the
inner normal function can be called.
On Tuesday, November 10, 2015 5:12 PM, Zhiliang Zhu <zchl.j...@yahoo.com>
wrote:
After more testing, the Function called by map/sortBy etc. must be defined as static,
or it can be d
After more testing, the Function called by map/sortBy etc. must be defined as static, or
it can be defined as non-static but must then be called from another static normal
function. I am really confused by this.
On Tuesday, November 10, 2015 4:12 PM, Zhiliang Zhu
<zchl.j...@yahoo.com.INVALID>
On Tuesday, November 10, 2015 11:42 AM, Deng Ching-Mallete
<och...@apache.org> wrote:
Hi Zhiliang,
You should be able to see them in the executor logs, which you can view via the
Spark UI, in the Executors page (stderr log).
HTH,Deng
On Tue, Nov 10, 2015 at 11:33 AM, Zhiliang Zhu
Hi All,
I need to debug a Spark job; my general way is to print out logs. However, some
bug is inside Spark functions such as mapPartitions etc., and no log printed from
those functions could be found... Would you help point out the way to the logs inside
Spark's own functions such as mapPartitions? Or, what is
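A short sketch of where such output ends up: code inside mapPartitions runs on the
executors, so its messages appear in the executor stderr (Spark UI -> Executors page, or
yarn logs -applicationId <id>), not in the driver log (sc is an assumed SparkContext):

val rdd = sc.parallelize(1 to 100, 4)
val out = rdd.mapPartitionsWithIndex { (idx, it) =>
  System.err.println(s"partition $idx has started")   // goes to executor stderr
  it.map(_ * 2)
}
out.count()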
e able to see them in the executor logs, which you can view via the
Spark UI, in the Executors page (stderr log).
HTH,Deng
On Tue, Nov 10, 2015 at 11:33 AM, Zhiliang Zhu <zchl.j...@yahoo.com.invalid>
wrote:
Hi All,
I need to debug a Spark job; my general way is to print out logs; howeve
Also in the Spark UI, that is, logs from other places can be found, but the logs
from functions such as mapPartitions cannot.
On Tuesday, November 10, 2015 11:52 AM, Zhiliang Zhu <zchl.j...@yahoo.com>
wrote:
Dear Ching-Mallete ,
There are machines master01, master02 and ma
Hi Ching-Mallete,
I have found the log and the reason for that.
Thanks a lot!Zhiliang
On Tuesday, November 10, 2015 12:23 PM, Zhiliang Zhu
<zchl.j...@yahoo.com.INVALID> wrote:
Also for Spark UI , that is, log from other places could be found, but the
log from the fun
breeze for the enhancement.
Where is the API or link for the Breeze quadratic minimizer integrated
with Spark? And where is the Breeze lpsolver...
Alternatively you can use the Breeze lpsolver as well, which uses simplex from Apache
math.
Thank you, Zhiliang
On Nov 4, 2015 1:05 AM, "Z
ple integration
> with spark. ecos runs as jni process in every executor.
>
> On Nov 1, 2015 9:52 AM, "Zhiliang Zhu" <zchl.j...@yahoo.com.invalid> wrote:
>>
>> Hi Ted Yu,
>>
>> Thanks very much for your kind reply.
>> Do you just mean that in s
ecos runs as jni process in every executor.
>
> On Nov 1, 2015 9:52 AM, "Zhiliang Zhu" <zchl.j...@yahoo.com.invalid> wrote:
>>
>> Hi Ted Yu,
>>
>> Thanks very much for your kind reply.
>> Do you just mean that in spark there is no specific
Hi All,
I would like to filter some elements in some given RDD so that only the needed ones are
left; the row count of the result RDD is then smaller.
Then I chose the filter function; however, by test, the filter function only
accepts a Boolean predicate, that is to say, only a JavaRDD of the same type is returned for
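A small Scala sketch related to the question above: filter keeps the element type, so to
filter and change the element type in one pass, flatMap returning zero-or-one results is
the usual idiom (the example data is made up, sc is an assumed SparkContext):

val strings  = sc.parallelize(Seq("1", "x", "3"))
val onlyInts = strings.flatMap { s =>
  try Some(s.toInt) catch { case _: NumberFormatException => None }   // drop unparsable rows
}
onlyInts.collect()   // Array(1, 3)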
, but currently, there is no open source
implementation in Spark.
Sincerely,
DB Tsai
--
Web: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D
On Sun, Nov 1, 2015 at 9:22 AM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
> Dear All,
>
> As for
Dear All,
As for N-dimensional linear regression, when the number of labeled training points
(or the rank of the labeled point space) is less than N, then from the perspective
of math the weights of the trained linear model may not be unique.
However, the output of model.weight() by Spark may be with
, 2015 at 9:37 AM, Zhiliang Zhu <zchl.j...@yahoo.com.invalid>
wrote:
Dear All,
As I am facing a typical linear programming issue, and I know the simplex method
is designed for solving LP problems, I am very sorry to ask whether there is
already some mature package in Spark for the simplex method.
Dear All,
As I am facing a typical linear programming issue, and I know the simplex method
is designed for solving LP problems, I am very sorry to ask whether there is
already some mature package in Spark for the simplex method...
Thank you very much~ Best Wishes! Zhiliang
Hi All,
There is some file with N + M lines, and I need to read the first N lines
into one RDD.
1. i) read all the N + M lines as one RDD, ii) select the RDD's top N rows, which may
be one solution (see the sketch below);
2. if a broadcast variable holding N is introduced, then it is
used to decide, while mapping, the
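A sketch of option 1 above (N and the path are placeholders, sc is an assumed SparkContext):

val N = 1000L
val firstN = sc.textFile("hdfs:///data/file.txt")
  .zipWithIndex()                        // (line, 0-based global index)
  .filter { case (_, idx) => idx < N }   // keep only the first N lines
  .keys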
Dear All,
I will program a small project with Spark, and run speed is a big concern.
I have a question: since an RDD is always big on the cluster, is it proper to pass an
RDD variable as a parameter in function calls?
Thank you, Zhiliang
ous, are you trying to solve systems of linear equations? If
so, you can probably try breeze.
On Sun, Oct 25, 2015 at 9:10 PM, Zhiliang Zhu
<zchl.j...@yahoo.com.invalid> wrote:
> On Monday, October 26, 2015 11:26 AM, Zhiliang Zhu
> <zchl.j...@yahoo.com.INVALID> wrote:
... includes an intercept in the model, e.g.
label = intercept + features dot weight
To get the result you want, you need to force the intercept to be zero.
Just curious, are you trying to solve systems of linear equations? If
so, you can probably try breeze.
On Sun, Oct 25, 2015 at 9:10 PM, Zhil
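A hedged sketch of the suggestion above: force the intercept to zero so the model becomes
label = features dot weights (spark.ml API, Spark 1.5+; the training DataFrame and its
column names are assumptions):

import org.apache.spark.ml.regression.LinearRegression

val lr = new LinearRegression()
  .setFitIntercept(false)
  .setFeaturesCol("features")
  .setLabelCol("label")
val model = lr.fit(trainingDF)   // trainingDF is an assumed DataFrame with those columns
println(model.coefficients)      // `weights` in Spark 1.5, `coefficients` from 1.6 on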
On Monday, October 26, 2015 11:26 AM, Zhiliang Zhu
<zchl.j...@yahoo.com.INVALID> wrote:
Hi DB Tsai,
Thanks very much for your kind help. I get it now.
I am sorry that there is another issue: the weight/coefficient result is
perfect when A is a triangular matrix, however,
B Tsai
--
Web: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D
On Sun, Oct 25, 2015 at 10:14 AM, Zhiliang Zhu
<zchl.j...@yahoo.com.invalid> wrote:
> Dear All,
>
> I have some program as below which makes me very much confused and
> inscrutable, it is about m
Dear All,
I have some program below which makes me very confused and puzzled;
it is about a multi-dimensional linear regression model. The weight / coefficient
is always perfect while the dimension is smaller than 4, otherwise it is wrong
all the time. Or, whether the
Hi Sujit, and All,
Currently I am stuck in a large difficulty; I am eager to get some help from you.
There is a big linear system of equations: Ax = b, where A has N rows
and N columns, N is very large, and b = [0, 0, ..., 0, 1]^T. Then I will
solve it to get x = [x1, x2, ..., xn]^T.
The
The
he.org/docs/1.2.0/mllib-dimensionality-reduction.html[2]
http://math.stackexchange.com/questions/458404/how-can-we-compute-pseudoinverse-for-any-matrix
On Fri, Oct 23, 2015 at 2:19 AM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
Hi Sujit, and All,
Currently I am stuck in a large difficulty, I am
Dear All,
I am new to spark ml.
There is some project for me: given some math model, I would like to get
its optimized solution. It is very similar to a Spark MLlib application.
However, the key problem for me is that the given math model does not obviously
belong to the models ( as
Dear All,
I would like to use spark ml to develop some project related to optimization
algorithms; however, in Spark 1.4.1 it seems that under ml's optimizer there are
only about 2 optimization algorithms.
My project may need more kinds of optimization algorithms, so how would I
use Spark
Hi All,
Would some expert help me a bit with this issue...
I shall appreciate your kind help very much!
Thank you!
Zhiliang
On Sunday, September 27, 2015 7:40 PM, Zhiliang Zhu
<zchl.j...@yahoo.com.INVALID> wrote:
Hi Alexis, Gavin,
Thanks very much for your kind comm
It seems that is due to the Spark SPARK_LOCAL_IP setting.
export SPARK_LOCAL_IP=localhost
will not work.
Then, how should it be set?
Thank you all~~
On Friday, September 25, 2015 5:57 PM, Zhiliang Zhu
<zchl.j...@yahoo.com.INVALID> wrote:
Hi Steve,
Thanks a lot for your
, or
for some other reasons...
This issue is urgent for me; would some expert provide some help about this
problem...
I will show sincere appreciation for your help.
Thank you! Best Regards, Zhiliang
On Friday, September 25, 2015 7:53 PM, Zhiliang Zhu
<zchl.j...@yahoo.com.INVALID>
Hi All,
I would like to submit a Spark job from another remote machine outside the
cluster. I also copied the hadoop/spark conf files onto the remote machine; then a
Hadoop job can be submitted, but a Spark job cannot.
In spark-env.sh, it may be because SPARK_LOCAL_IP is not properly set, or
on the linux command side?
Best Regards, Zhiliang
On Saturday, September 26, 2015 10:07 AM, Gavin Yue
<yue.yuany...@gmail.com> wrote:
Print out your env variables and check first
Sent from my iPhone
On Sep 25, 2015, at 18:43, Zhiliang Zhu <zchl.j...@yahoo.com.INVALID> wrote:
Hi all,
The Spark job will run on YARN. I either do not set SPARK_LOCAL_IP at all, or just
set it as
export SPARK_LOCAL_IP=localhost   # or set as the specific node ip
on the specific spark install directory.
It works well to submit the Spark job on the master node of the cluster; however, it
will fail by
And the remote machine is not in the same local area network as the cluster.
On Friday, September 25, 2015 12:28 PM, Zhiliang Zhu
<zchl.j...@yahoo.com.INVALID> wrote:
Hi Zhan,
I have done that following your kind help.
However, I could only use "hadoop fs -ls/-mkdir/-rm XX
ste...@hortonworks.com> wrote:
On 25 Sep 2015, at 05:25, Zhiliang Zhu <zchl.j...@yahoo.com.INVALID> wrote:
However, I could only use "hadoop fs -ls/-mkdir/-rm XXX" commands to operate on
the remote machine with the gateway,
which means the namenode is reachable; all those commands
Thanks.
Zhan Zhang
On Sep 22, 2015, at 8:14 PM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
Hi Zhan,
Yes, I get it now.
I have never deployed the Hadoop configuration locally, and cannot find the
specific doc; would you help provide the doc for doing that...
Thank you, Zhiliang
On Wednesday, Septemb
Hi All,
There are two RDDs: RDD rdd1 and RDD rdd2; that
is to say, rdd1 and rdd2 are similar to DataFrames, or matrices with the same row
number and column number.
I would like to get an RDD rdd3, where each element in rdd3 is the
difference between rdd1 and rdd2 at the same position.
There is a matrix add API; might I map each row element of rdd2 to be negative, then
combine rdd1 and rdd2 and call add?
Or are there other ways ...
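A Scala sketch of one way to do the element-wise subtraction by zipping the two RDDs (this
assumes rdd1 and rdd2 have the same number of partitions and the same number of elements
per partition, which zip requires; the sample data is made up):

val rdd1 = sc.parallelize(Seq(Array(1.0, 2.0), Array(3.0, 4.0)), 2)
val rdd2 = sc.parallelize(Seq(Array(0.5, 1.0), Array(1.5, 2.0)), 2)
val rdd3 = rdd1.zip(rdd2).map { case (a, b) =>
  a.zip(b).map { case (x, y) => x - y }   // per-row, per-column difference
}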
On Wednesday, September 23, 2015 3:11 PM, Zhiliang Zhu
<zchl.j...@yahoo.com> wrote:
Hi All,
There are two RDDs : RDD<Array> rdd1, a
ray(0.0, -8.0, 0.0))
-sujit
On Wed, Sep 23, 2015 at 12:23 AM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
there is matrix add API, might map rdd2 each row element to be negative , then
make rdd1 and rdd2 and call add ?
Or some more ways ...
On Wednesday, September 23, 2015 3:11 P
Dear Experts,
A Spark job is running on the cluster via YARN. The job can be submitted on
a machine from the cluster; however, I would like to submit the
job from another machine which does not belong to the cluster. I know that for this,
a Hadoop job could be done by way of another
Sep 22, 2015, at 7:49 PM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
Hi Zhan,
Thanks very much for your helpful comment. I also think it would be similar to
Hadoop job submission; however, I was not sure whether it is like that when it
comes to Spark.
Have you ever tried that for Spark... Would
achine, and point the HADOOP_CONF_DIR in spark to the
configuration.
Thanks
Zhan Zhang
On Sep 22, 2015, at 6:37 PM, Zhiliang Zhu <zchl.j...@yahoo.com.INVALID> wrote:
Dear Experts,
Spark job is running on the cluster by yarn. Since the job can be submited at
the place on the machine from
the former is used to access hdfs,
and the latter is used to launch applications on top of yarn.
Then in spark-env.sh, you add export HADOOP_CONF_DIR=/etc/hadoop/conf.
Thanks.
Zhan Zhang
On Sep 22, 2015, at 8:14 PM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
Hi Zhan,
Yes, I get it now.
I h
Dear Sujit,
Since you are senior with Spark, may I ask whether it is convenient for
you to help comment a bit on my dilemma
while using Spark to deal with an R-background application ...
Thank you very much! Zhiliang
On Tuesday, September 22, 2015 1:45 AM, Zhiliang Zhu <zch
Dear ,
I have spent lots of days thinking about this issue, however, without any
success... I shall appreciate all your kind help.
There is an RDD rdd1; I would like to get a new RDD rdd2, where each row
rdd2[ i ] = rdd1[ i ] - rdd1[ i - 1 ]. What kind of API or function would I use...
Thanks very
airRDD, and then use outer join.
Does that make sense?
On Mon, Sep 21, 2015 at 8:37 PM Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
Dear Romi, Priya, Sujt and Shivaram and all,
I have spent lots of days thinking about this issue, however, without any good
enough solution... I shall appreciate your a
g for the
order of the items.
What exactly are you trying to accomplish?
Romi Kuntsman, Big Data Engineer
http://www.totango.com
On Mon, Sep 21, 2015 at 2:29 PM, Zhiliang Zhu <zchl.j...@yahoo.com.invalid>
wrote:
Dear ,
I have spent lots of days thinking about this issue, however, without
Dear Romi, Priya, Sujt and Shivaram and all,
I have spent lots of days thinking about this issue, however, without any good
enough solution... I shall appreciate all your kind help.
There is an RDD rdd1, and another RDD rdd2,
(rdd2 can be PairRDD, or DataFrame with two columns
park/mllib/rdd/SlidingRDD.html
So maybe something like this:
new SlidingRDD(rdd1, 2, ClassTag$.apply(Class))
-sujit
On Mon, Sep 21, 2015 at 9:16 AM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
Hi Sujit,
I must appreciate your kind help very much~
It seems to be OK, however, do you know the
, September 21, 2015 11:48 PM, Sujit Pal <sujitatgt...@gmail.com>
wrote:
Hi Zhiliang,
Would something like this work?
val rdd2 = rdd1.sliding(2).map(v => v(1) - v(0))
-sujit
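A slightly fuller sketch of the sliding(2) idea above; sliding lives in
org.apache.spark.mllib.rdd.RDDFunctions and is brought in by an implicit import (the
sample data is made up, sc is an assumed SparkContext):

import org.apache.spark.mllib.rdd.RDDFunctions._

val rdd1 = sc.parallelize(Seq(1.0, 4.0, 9.0, 16.0), 2)
val rdd2 = rdd1.sliding(2).map(v => v(1) - v(0))   // differences of adjacent elements
rdd2.collect()                                     // Array(3.0, 5.0, 7.0)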
On Mon, Sep 21, 2015 at 7:58 AM, Zhiliang Zhu <zchl.j...@yahoo.com.invalid>
wrote:
Hi Romi,
Thanks very much for your kind help comment~~
In fact there is some