Re: Updating A Vector

2017-04-27 Thread Matthias Boehm
If the values in matrix2 are aligned with their positions as in your example, then you can do
the following (which works for arbitrary values in matrix1, but you could
simplify it if you have just 1s):

matrix1 = matrix1*(matrix2==0) + (matrix2!=0)*2;

The only problematic case would be special values such as NaNs in matrix1,
because they cannot be pruned away (since NaN * 0 = NaN), but you can use
our replace builtin function to eliminate NaNs beforehand.
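As a sanity check, the same update can be sketched in NumPy (an assumed analogue of the DML one-liner, not DML itself), including the NaN cleanup mentioned above:

```python
import numpy as np

# NumPy analogue of the DML expression:
#   matrix1 = matrix1*(matrix2==0) + (matrix2!=0)*2;
matrix1 = np.ones((10, 1))
matrix2 = np.array([0, 2, 3, 4, 0, 0, 0, 0, 0, 0], dtype=float).reshape(10, 1)

# Eliminate NaNs first (the role of DML's replace builtin), since NaN * 0 = NaN
matrix1 = np.nan_to_num(matrix1, nan=0.0)

# Keep matrix1 where matrix2 is zero; write 2 where matrix2 is nonzero
result = matrix1 * (matrix2 == 0) + (matrix2 != 0) * 2

print(result.ravel().tolist())
# → [1.0, 2.0, 2.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
```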

Regards,
Matthias

On Thu, Apr 27, 2017 at 9:46 AM, arijit chakraborty wrote:

> Hi,
>
>
> I've 2 matrix:
>
>
> matrix1 = matrix(1, 10, 1)
>
> matrix2 = matrix(seq(1,10,1), 10, 1)
>
>
> matrix1's values will be updated based on matrix2. E.g., suppose in matrix2
> only positions 2, 3, and 4 have values and the rest are 0; then matrix1's
> value will be updated to, say, 2 at positions 2, 3, 4. So the new matrices
> look like this:
>
>
> matrix1 = matrix("1 2 2 2 1 1 1 1 1 1", 10, 1)
>
> matrix2 = matrix("0 2 3 4 0 0 0 0 0 0", 10, 1)
>
>
> I used the following code to update the matrix:
>
>
> matrix2_1 = removeEmpty(target = matrix2, margin = "rows");
>
>
> for(k in 1:nrow(matrix2_1)) {
>
> matrix1[as.scalar(matrix2_1[k,]),] = 2
>
> }
>
> This code works, but I would like to do this calculation in matrix form. I
> tried it using the "table" function, but I don't yet fully understand how
> "table" works, so I was unable to do it.
>
> Can anyone please help me to solve the problem?
>
> Thank you!
> Arijit
>
>


Re: Please reply ASAP : Regarding incubator systemml/breast_cancer project

2017-04-27 Thread dusenberrymw
Hi Aishwarya,

Yes, it is quite strange that Jupyter isn't running on the PySpark kernel even 
though it's being started in that manner.  The good news is that we use this 
setup every day, so once we find the root issue with your Jupyter, it should work 
great!  Let's try temporarily removing all of the existing Jupyter/IPython 
settings & kernels and basically start fresh.  Assuming you are on OS X / macOS 
or Linux, can you do the following? (Please double-check the exact paths, as 
I'm typing on a phone.)

* Stop Jupyter, and make sure that it is not running.
* Temporarily remove the Jupyter kernels.  First, you will need to see where 
they are installed, and then just rename that path.
`jupyter kernelspec list`
# Look at the paths above.  For example, on macOS the kernels may be located at 
~/Library/Jupyter/kernels, in which case you would move them aside with the 
following.  Update this as needed for the exact paths listed above.
`mv ~/Library/Jupyter/kernels ~/Library/Jupyter/kernels_OLD`
* Temporarily remove the Jupyter & IPython settings:
`mv ~/.jupyter ~/.jupyter_OLD`
`mv ~/.ipython ~/.ipython_OLD`
* Make sure Jupyter is up to date:
`pip3 install -U ipython jupyter`

After that, please ensure that Jupyter is not running, then start it in the 
context of PySpark as sent previously.  Once Jupyter is started this time, 
there should only be one kernel listed, and `sc` should be available.

Can you try that?

--

Mike Dusenberry
GitHub: github.com/dusenberrymw
LinkedIn: linkedin.com/in/mikedusenberry

Sent from my iPhone.


> On Apr 26, 2017, at 2:13 AM, Aishwarya Chaurasia wrote:
> 
> Hi sir,
> The sc NameError persists.
> 
> (1) There is only one jupyter server running. And that was started with the
> pyspark command in the previous mail.
> (2) Two kernels are appearing in the change kernel option - Python3 and
> Python2. Tried with both of them and the result is the same.
> 
> How can Jupyter not be running on the PySpark kernel when we started the
> notebook with the pyspark command itself?
> 
> Is it possible to create a .py file of MachineLearning.ipynb, as was done
> with preprocessing.ipynb, by explicitly creating a SparkContext()?
> 
>> On 25-Apr-2017 11:57 PM,  wrote:
>> 
>> Hi Aishwarya,
>> 
>> Unfortunately this mailing list removes all images, so I can't view your
>> screenshot.  I'm assuming that it is the same issue with the missing
>> SparkContext `sc` object, but please let me know if it is a different
>> issue.  This sounds like it could be an issue with multiple kernels
>> installed in Jupyter.  When you start the notebook, can you see if there
>> are multiple kernels listed in the "Kernel" -> "Change Kernel" menu?  If
>> so, please try one of the other kernels to see if Jupyter is starting by
>> default with a non-spark kernel.  Also, is it possible that you have more
>> than one instance of the Jupyter server running?  I.e. for this scenario,
>> we start Jupyter itself directly via pyspark using the command sent
>> previously, whereas usually Jupyter can just be started with `jupyter
>> notebook`.  In the latter case, PySpark (and thus `sc`) would *not* be
>> available (unless you've set up special PySpark kernels separately).  In
>> summary, can you (1) check for other kernels via the menus, and (2) check
>> for other running Jupyter servers that are non-PySpark?
>> 
>> As for the other inquiry, great question!  When training models, it's
>> quite useful to track the loss and other metrics (i.e. accuracy) from
>> *both* the training and validation sets.  The reasoning is that it allows
>> for a more holistic view of the overall learning process, such as
>> evaluating whether any overfitting or underfitting is occurring.  For
>> example, say that you train a model and achieve an accuracy of 80% on the
>> validation set.  Is this good?  Is this the best that can be done?  Without
>> also tracking performance on the training set, it can be difficult to make
>> these decisions.  Say that you then measure the performance on the training
>> set and find that the model achieves 100% accuracy on that data.  That
>> might be a good indication that your model is overfitting the training set,
>> and that a combination of more data, regularization, and a smaller model
>> may be helpful in raising the generalization performance, i.e. the
>> performance on the validation set and future real examples on which you
>> wish to make predictions.  If on the other hand, the model achieved an 82%
>> on the training set, this could be a good indication that the model is
>> underfitting, and that a combination of a more expressive model and better
>> data could be helpful.  In summary, tracking performance on both the
>> training and validation datasets can be useful for determining ways in
>> which to improve the overall learning process.
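The train/validation reasoning above can be condensed into a small Python heuristic; the gap and accuracy thresholds below are hypothetical, purely for illustration:

```python
def diagnose(train_acc, val_acc, gap_tol=0.05, low_bar=0.85):
    """Rough heuristic for reading a train/validation accuracy pair.

    gap_tol and low_bar are illustrative thresholds, not fixed rules.
    """
    if train_acc - val_acc > gap_tol:
        # Large train/val gap: the model memorizes the training set
        return "overfitting: consider more data, regularization, or a smaller model"
    if train_acc < low_bar:
        # Model cannot even fit the training data well
        return "underfitting: consider a more expressive model or better data"
    return "reasonable fit"

print(diagnose(1.00, 0.80))  # large gap -> overfitting
print(diagnose(0.82, 0.80))  # low train accuracy -> underfitting
```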
>> 
>> 
>> - Mike
>> 
>> --
>> 
>> Mike Dusenberry
>> GitHub: github.com/dusenberrymw
>> LinkedIn: 

Updating A Vector

2017-04-27 Thread arijit chakraborty
Hi,


I've 2 matrix:


matrix1 = matrix(1, 10, 1)

matrix2 = matrix(seq(1,10,1), 10, 1)


matrix1's values will be updated based on matrix2. E.g., suppose in matrix2 only 
positions 2, 3, and 4 have values and the rest are 0; then matrix1's value will be 
updated to, say, 2 at positions 2, 3, 4. So the new matrices look like this:


matrix1 = matrix("1 2 2 2 1 1 1 1 1 1", 10, 1)

matrix2 = matrix("0 2 3 4 0 0 0 0 0 0", 10, 1)


I used the following code to update the matrix:


matrix2_1 = removeEmpty(target = matrix2, margin = "rows");


for(k in 1:nrow(matrix2_1)) {

matrix1[as.scalar(matrix2_1[k,]),] = 2

}

This code works, but I would like to do this calculation in matrix form. I tried 
it using the "table" function, but I don't yet fully understand how "table" works, 
so I was unable to do it.
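The loop above can be vectorized through a 0/1 indicator vector built from the nonzero positions, which is the role a table-style construction plays here. The following is a NumPy analogue (an assumed equivalent sketch, not DML itself):

```python
import numpy as np

matrix1 = np.ones(10)
matrix2 = np.array([0, 2, 3, 4, 0, 0, 0, 0, 0, 0])

# removeEmpty(target=matrix2, margin="rows") keeps the nonzero entries,
# which here are the target positions (1-based, as in DML)
idx = matrix2[matrix2 != 0].astype(int)

# Build a 0/1 indicator vector from the index list -- the kind of
# indicator a table-style construction would produce
indicator = np.zeros(10)
indicator[idx - 1] = 1  # shift to 0-based positions

# Vectorized update: keep the old value where indicator is 0, write 2 where it is 1
matrix1 = matrix1 * (1 - indicator) + 2 * indicator

print(matrix1.tolist())
# → [1.0, 2.0, 2.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
```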

Can anyone please help me to solve the problem?

Thank you!
Arijit