Re: [R] Mathematical working procedure of imputation methods (medianImpute, knnImpute, and bagImpute) in caret package R

2022-09-22 Thread K Purna Prakash
Thanks, I'll check them out.

On Wed, Sep 21, 2022, 03:16 Bert Gunter wrote:

> R is open source. Look at the code and read it.
> Alternatively, look at references for all of this, e.g. on Wikipedia or
> via web search. We generally do not provide statistical instruction on this
> list.
>
> Bert
>
> On Tue, Sep 20, 2022 at 2:20 PM K Purna Prakash wrote:
>
>> Dear Sir/Madam,
>> Greetings!!!
>>
>> Kindly provide the detailed internal mathematical working mechanism of the
>> following median, KNN, and bagging imputation methods available in the
>> caret package for R.
>>
>>  preProcess(train_data, method = "medianImpute")
>>  preProcess(train_data, method = "knnImpute")
>>  preProcess(train_data, method = "bagImpute")
>>
>> The details provided by you will help me a lot for a better understanding
>> of these imputation methods especially while dealing with large sets of
>> data.
>>
>> I will look forward to hearing from you.
>>
>> Thanks and regards,
>> K. Purna Prakash.
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>



Re: [R] Mathematical working procedure of imputation methods (medianImpute, knnImpute, and bagImpute) in caret package R

2022-09-20 Thread K Purna Prakash
Dear Sir/Madam,
Greetings!!!

Kindly provide the detailed internal mathematical working mechanism of the
following median, KNN, and bagging imputation methods available in the
caret package for R.

 preProcess(train_data, method = "medianImpute")
 preProcess(train_data, method = "knnImpute")
 preProcess(train_data, method = "bagImpute")

The details provided by you will help me a lot for a better understanding
of these imputation methods especially while dealing with large sets of
data.

I will look forward to hearing from you.

Thanks and regards,
K. Purna Prakash.
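
As a rough illustration of the simplest of the three methods: as far as the caret documentation describes it, medianImpute replaces each missing value with the median of that predictor computed on the training set. The base-R sketch below only mimics that idea. It is not caret's source code, and the helper names fit_median_impute and apply_median_impute are made up for this example. knnImpute (averaging over nearest neighbours) and bagImpute (bagged trees) are not covered here.

```r
# Hypothetical helpers, for illustration only (not part of caret).

# Learn one median per column from the training data, ignoring NAs.
fit_median_impute <- function(train) {
  vapply(train, median, numeric(1), na.rm = TRUE)
}

# Fill the NAs of each column with the stored training median.
apply_median_impute <- function(data, medians) {
  for (nm in names(medians)) {
    miss <- is.na(data[[nm]])
    data[[nm]][miss] <- medians[[nm]]
  }
  data
}

train_data <- data.frame(a = c(1, NA, 3, 10), b = c(NA, 2, 4, 6))
med <- fit_median_impute(train_data)
imputed <- apply_median_impute(train_data, med)
print(imputed)
```

The medians are learned once and then reused, which is the same fit-then-apply pattern as preProcess() followed by predict().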



[R] Mathematical working procedure of duplicated() function in r

2020-08-04 Thread K Purna Prakash
Dear Sir(s),
I request you to provide the detailed internal mathematical working
mechanism of the following expression, for better understanding:
x[duplicated(x) | duplicated(x, fromLast=TRUE), ]
I am confused about how duplicates are identified when there are
thousands of records.
I look forward to a positive response.
Thank you,
K.Purna Prakash.
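
A small worked example may help show what the expression selects. duplicated(x) flags a row when an identical row has already appeared above it, and fromLast=TRUE does the same scan from the bottom up, so the OR of the two flags every row that has at least one identical partner. The data frame below is invented for illustration.

```r
# Toy data (made up): rows 1 and 3 are identical, rows 2 and 5 are identical.
x <- data.frame(id  = c(1, 2, 1, 3, 2),
                val = c("a", "b", "a", "c", "b"))

fwd <- duplicated(x)                   # TRUE where an identical row occurs earlier
bwd <- duplicated(x, fromLast = TRUE)  # TRUE where an identical row occurs later

all_dups <- x[fwd | bwd, ]             # every member of each duplicate group
print(all_dups)
```

Here fwd marks rows 3 and 5, bwd marks rows 1 and 2, and the combined result keeps rows 1, 2, 3 and 5 while dropping the unique row 4.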



[R] Request for unsubscribe from this forum.

2013-01-27 Thread Purna chander
Dear admin members,


My inbox is constantly being flooded with posts. For this reason, I
wish to unsubscribe from this forum.

Can you tell me how to do that?

Regards,
Purna



[R] How to remove the vertical space between two graphs

2013-01-22 Thread Purna chander
Hi,

I have created a barplot using the following code.

a <- c(11,23,15,34,42,31)
m <- matrix(a,nrow=2)
m[2,] <- (-1)*m[2,]

par(mar=c(4,4,4,0))
barplot(m[2,],horiz=TRUE)

par(mar=c(4,0,4,2))
barplot(m[1,],horiz=TRUE,col="black")

and the plot obtained is shown in plot1.tiff.

I do not want the gap (vertical space) between the two graphs.
How can I remove it?


I also tried to achieve my goal in a single plot, using this code:

a <- c(11,23,15,34,42,31)
m <- matrix(a,nrow=2)
m[2,] <- (-1)*m[2,]

barplot(m,horiz=TRUE,beside=TRUE)

and the plot obtained is shown in plot2.tiff.

In the second attempt I am able to place the bars next to each other
using the beside=TRUE argument. However, it fails when I use
beside=FALSE (this produced plot3.tiff).

Can you suggest how to achieve my goal (similar to plot2, but with no
vertical space)?

Regards,
Purna
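
One possible way to get back-to-back bars with no gap is to draw both sets of bars in a single plotting region: plot the positive row first with an x-range wide enough for both sides, then overlay the negated row with add=TRUE. This is only a sketch of that idea, reusing the data from the post; the variable names xlim, mid_right and mid_left are introduced here.

```r
a <- c(11, 23, 15, 34, 42, 31)
m <- matrix(a, nrow = 2)

# x-range covering the negated second row (left) and the first row (right)
xlim <- c(-max(m[2, ]), max(m[1, ]))

# First row to the right of zero; barplot() returns the bar midpoints.
mid_right <- barplot(m[1, ], horiz = TRUE, xlim = xlim, col = "black")

# Second row mirrored to the left of zero, drawn into the same plot.
mid_left <- barplot(-m[2, ], horiz = TRUE, add = TRUE, axes = FALSE)
```

Because both calls use the same number of bars and default widths, the two sets of bars share the same vertical positions and meet at zero.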


[R] problem in finding sizes of objects using a for loop

2012-10-25 Thread Purna chander
Dear All,

I wanted to extract the sizes of all created objects. For example, when I
created two objects (x and y), I got their sizes using the following
code:

> x<-rnorm(10000)
> y<-runif(100,min=40,max=1000)
> ls()
[1] "x" "y"
> object.size(x)
80024 bytes
> object.size(y)
824 bytes

However, I was unable to get their sizes when I used a for loop in the
following way:

> objects<-ls()
> for (i in seq_along(objects)){
+   print(c(objects[i],object.size(objects[i])))
+
+ }
[1] "x"  "64"
[1] "y"  "64"


The result obtained in the second case is wrong.

I understand that the names x and y are being treated as character
strings rather than as the objects themselves. How can I rectify this?

Regards,
Purna
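
A minimal sketch of one way around this: the loop measures the one-element character vectors "x" and "y" rather than the objects they name, so the name has to be turned back into the object with get() before calling object.size(). The names obj_names and sizes are introduced here for illustration.

```r
x <- rnorm(10000)
y <- runif(100, min = 40, max = 1000)

obj_names <- c("x", "y")   # in an interactive session this could be ls()

# get(nm) looks up the object called nm; object.size() then measures it.
sizes <- sapply(obj_names, function(nm) as.numeric(object.size(get(nm))))
print(sizes)               # named numeric vector of sizes in bytes
```

Exact byte counts depend on the R version and platform, so none are shown here.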



[R] Regarding the memory allocation problem

2012-10-25 Thread Purna chander
Dear All,


My main objective was to compute the distance of 10 vectors from a
set having 900 other vectors. I have a file named seq_vec containing
10 records and 256 columns.
While computing, the memory was not sufficient and this resulted in the
error "cannot allocate vector of size 152.1Mb".

So I approached the problem in the following way:
Rather than reading the data all at once, I read the data in
chunks of 2 records using the scan() function. After reading each
chunk, I computed the distance of each of these vectors from a set of
other vectors.

Even though I was successful in computing the distances for the first 3
chunks, I then obtained a similar error ("cannot allocate vector of size
102.3Mb").

Q) What I could not understand is: how does memory become
insufficient when dealing with the 4th chunk?
Q) Suppose I computed a matrix 'm' during the calculation associated
with chunk 1. Is this matrix not replaced when I compute 'm' again
while dealing with chunk 2?


Regards,
Purna
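
On the second question: assigning to 'm' again rebinds the name, and the previous matrix then becomes unreachable and can be reclaimed by R's garbage collector. A failure on a later chunk may therefore come from something else that keeps growing (for example accumulated results) or from the lack of a large enough contiguous block, though that is a guess without seeing the code. The sketch below shows chunked reading from an open connection with scan(); the function name chunked_row_sums, the row-sum computation, and the tiny temporary file are stand-ins invented for this example.

```r
# Read a whitespace-separated numeric file chunk_rows lines at a time
# and compute something per chunk (row sums stand in for the distances).
chunked_row_sums <- function(path, n_cols, chunk_rows) {
  con <- file(path, open = "r")
  on.exit(close(con))
  out <- numeric(0)
  repeat {
    # scan() on an open connection continues where the last call stopped
    vals <- scan(con, what = numeric(), nlines = chunk_rows, quiet = TRUE)
    if (length(vals) == 0) break
    # Rebinding 'm' drops the reference to the previous chunk's matrix
    m <- matrix(vals, ncol = n_cols, byrow = TRUE)
    out <- c(out, rowSums(m))
  }
  out
}

# Tiny demonstration file: 5 rows, 3 columns
tmp <- tempfile()
write.table(matrix(1:15, ncol = 3, byrow = TRUE), tmp,
            row.names = FALSE, col.names = FALSE)
res <- chunked_row_sums(tmp, n_cols = 3, chunk_rows = 2)
print(res)
```

Only the per-chunk result is kept between iterations, so peak memory is governed by the chunk size rather than by the file size.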



[R] Any better way of optimizing time for calculating distances in the mentioned scenario??

2012-10-12 Thread Purna chander
Dear All,

I'm dealing with a case where the 'manhattan' distance of each of 100
vectors from 1 other vectors is calculated. To achieve this, I tested
the following 4 scenarios:

1) scenario 1:
 x <- read.table("query.vec")
 v <- read.table("query.vec2")

 d <- matrix(nrow=nrow(v),ncol=nrow(x))
 for (i in 1:nrow(v)){
   d[i,] <- sapply(1:nrow(x),function(z){dist(rbind(v[i,],x[z,]),method="manhattan")})
 }
 print(d[1,1:10])

time taken for running the code is:
real    1m33.088s
user    1m32.287s
sys     0m0.036s

2) scenario 2:

 x <- read.table("query.vec")
 v <- read.table("query.vec2")
 v <- as.matrix(v)
 d <- matrix(nrow=nrow(v),ncol=nrow(x))
 for (i in 1:nrow(v)){
   tmp_m <- matrix(rep(v[i,],nrow(x)),nrow=nrow(x),byrow=TRUE)
   d[i,] <- rowSums(abs(tmp_m - x))
 }
 print(d[1,1:10])

time taken for running the code is:
real    0m0.882s
user    0m0.854s
sys     0m0.025s

3) scenario 3:

 x <- read.table("query.vec")
 v <- read.table("query.vec2")
 v <- as.matrix(v)
 d <- matrix(nrow=nrow(v),ncol=nrow(x))
 for (i in 1:nrow(v)){
   d[i,] <- sapply(1:nrow(x),function(z){dist(rbind(v[i,],x[z,]),method="manhattan")})
 }
 print(d[1,1:10])

time taken for running the code is:
real    1m3.817s
user    1m3.543s
sys     0m0.031s

4) scenario 4:
 x <- read.table("query.vec")
 v <- read.table("query.vec2")
 v <- as.matrix(v)
 d <- dist(rbind(v,x),method="manhattan")
 m <- as.matrix(d)
 # rows of 'x' occupy columns nrow(v)+1 ... nrow(v)+nrow(x) of 'm'
 m2 <- m[1:nrow(v),(nrow(v)+1):(nrow(v)+nrow(x))]
 print(m2[1,1:10])

time taken for running the code:
real    0m0.445s
user    0m0.401s
sys     0m0.041s


Queries:
1) Though scenario 4 is the fastest, it failed when matrix 'v' had a
larger number of rows. An error occurred while converting the distance
object 'd' to a matrix 'm'.
 For example:  m <- as.matrix(d)
   resulted in: Error: cannot allocate vector of size 922.7 MB.

So, how can a larger dist object be converted into a matrix, or
how can the allocation limit be increased?

2) I observed that the 'dist()' function calculates the distances
between all pairs of vectors in a given matrix or data frame. Is it not
possible to calculate the distances of specific vectors from the other
vectors in a matrix using 'dist()'? For example, if a matrix 'x' has
20 rows, can 'dist()' calculate only the distances of the 1st row
vector from the other 19 vectors?

3) Any other ideas for optimizing this problem are welcome.

Regards,
Purnachander
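
One possible direction, sketched under the assumption that only the v-to-x distances are needed: compute the cross-distance block directly, one row of 'v' at a time, so that the full (nrow(v)+nrow(x)) square matrix from dist() is never built. The function name manhattan_cross and the random test data are invented for this example; it is a variant of scenario 2 that avoids creating the replicated matrix tmp_m.

```r
# Manhattan distances between every row of v and every row of x.
# Returns an nrow(v) x nrow(x) matrix.
manhattan_cross <- function(v, x) {
  v <- as.matrix(v)
  x <- as.matrix(x)
  tx <- t(x)   # each column of tx is one row of x
  d <- matrix(NA_real_, nrow = nrow(v), ncol = nrow(x))
  for (i in seq_len(nrow(v))) {
    # v[i, ] is recycled down each column of tx, i.e. subtracted from every row of x
    d[i, ] <- colSums(abs(tx - v[i, ]))
  }
  d
}

set.seed(1)
x <- matrix(runif(20 * 4), nrow = 20)   # 20 reference vectors
v <- matrix(runif(3 * 4), nrow = 3)     # 3 query vectors
d <- manhattan_cross(v, x)
print(dim(d))
```

Memory use is proportional to nrow(v) * nrow(x) for the result plus one transposed copy of 'x', and 'v' could additionally be processed in blocks if even that is too large.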



Re: [R] Text file: multiple matrix

2012-10-10 Thread Purna chander
Hi,

You can try this and see. I'm assuming that the initial text file is
named test.txt.

x <- read.table("test.txt",header=TRUE)  # if headers are present in test.txt
  or
x <- read.table("test.txt")

# read.table() skips blank lines by default, so the rows of the
# 100 matrices end up stacked directly on top of each other in x.

n <- 256  # number of rows per matrix

for (i in 1:100){
  filename <- paste("file_",i,".txt",sep="")
  m <- x[((i-1)*n+1):(i*n),]
  write.table(m,filename,row.names=FALSE,col.names=FALSE)
}

Regards,
Purna


On 10/9/12, ludovico ludovicofr...@hotmail.it wrote:
> Hi there! I'm a newbie in R
> This is my problem: I have a txt file composed of 100 matrices (256x256)
> separated by a blank line! How can I automatically save the matrices in
> separate txt files (100)?
> e.g.
> 1° matrix from line 1 to line 256
> 257 blank line
> 2° matrix from line 258 to line 513
> 514 blank line
> 3° matrix from line 515 to line 770
> 771 blank line
> 4° matrix from line 772 to line 1027..
> Thanks








[R] delete rows whose sum is X

2011-03-04 Thread purna
Rnoob here.
I have a matrix of zeroes and ones. I want to delete the rows whose sum of
values is not equal to 5, or alternatively extract the rows that sum to 5.
 
Thank you/Mikael
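
A minimal sketch of the usual approach: compute the row sums with rowSums() and use the resulting logical vector to index the rows. The 4 x 6 example matrix and the names keep and m5 are made up for illustration.

```r
# Example 0/1 matrix; its row sums are 5, 3, 5 and 0.
m <- matrix(c(1, 1, 1, 1, 1, 0,
              1, 0, 1, 0, 1, 0,
              1, 1, 1, 0, 1, 1,
              0, 0, 0, 0, 0, 0),
            nrow = 4, byrow = TRUE)

keep <- rowSums(m) == 5        # TRUE for rows whose entries add up to 5
m5 <- m[keep, , drop = FALSE]  # drop = FALSE keeps a matrix even for one row
print(m5)
```

Deleting the rows that do not sum to 5 and extracting the rows that do are the same operation here, so one indexing step covers both phrasings.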



[R] matrix help

2010-09-26 Thread purna

Does anyone know how to write a function that solves:

(1 + c)x1 +       x2 +       x3 = 5
      x1 + (1 + c)x2 +       x3 = 5 + 2c
      x1 +       x2 + (1 + c)x3 = 5 + 3c,

where c is a small constant, for 1000 equidistant values c = (10^-14,
2*10^-14, ..., 10^-11), by using Cholesky decomposition?

/P
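
A sketch of one way to set this up, offered as an illustration rather than a definitive answer. The coefficient matrix is a matrix of ones plus c times the identity, which is symmetric positive definite for c > 0, so chol() applies. Adding the three equations gives (3 + c)(x1 + x2 + x3) = 15 + 5c, hence x1 + x2 + x3 = 5 and the exact solution is (0, 2, 3) for any c != 0, which gives something to compare against. The function name solve_chol and the tryCatch fallback are choices made for this example.

```r
# Solve the 3x3 system for one value of c via Cholesky factorisation.
solve_chol <- function(c_val) {
  A <- matrix(1, 3, 3) + diag(c_val, 3)   # ones everywhere, 1 + c on the diagonal
  b <- 5 + c_val * c(0, 2, 3)             # right-hand sides 5, 5 + 2c, 5 + 3c
  R <- chol(A)                            # upper triangular, A = t(R) %*% R
  y <- forwardsolve(t(R), b)              # solve t(R) y = b
  as.numeric(backsolve(R, y))             # solve R x = y
}

# 1000 equidistant values from 1e-14 to 1e-11 (step 1e-14)
c_vals <- seq(1e-14, 1e-11, length.out = 1000)

# One row of solutions per c; NA if chol() rejects the nearly singular matrix.
sols <- t(sapply(c_vals, function(cv) {
  tryCatch(solve_chol(cv), error = function(e) rep(NA_real_, 3))
}))
print(dim(sols))
```

For such tiny c the matrix is very close to singular, so the computed solutions can be expected to deviate noticeably from (0, 2, 3); comparing them against the exact values shows how much accuracy is lost.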

