On Jun 30, 2017 5:32 AM, Bernat Gel wrote:
>
> Ok, so it seems more like a bug somewhere than something I failed to
> understand, then. One of the surprises for me is that shuffling the data so
> the misses do not happen one after the other seems t
Ok, that makes sense.
In my current use case I think I'll be able to first filter out the
elements that will miss, so this behaviour is not triggered.
But it's good to know this happens so I can try to avoid it in the future.
Thanks.
Bernat
*Bernat Gel Moreno*
Bioinformatician
Hereditary C
The reason it's faster when shuffled vs. all at the end is that when a
miss happens R compares the string to all strings before it in the
subscript. So it's a lot worse to have a miss towards the end.
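
A toy model of the cost described above, written in Python rather than R
purely to count comparisons. It assumes, as stated, that a miss at
position i is compared against the i strings before it in the subscript,
and that hits cost nothing. `miss_comparisons` is a hypothetical helper
and the sizes are arbitrary; this is not R's actual subscript code.

```python
def miss_comparisons(is_miss):
    """Count string comparisons under the toy model: a miss at
    position i is compared against the i earlier subscript entries."""
    return sum(i for i, miss in enumerate(is_miss) if miss)


n, m = 1000, 100  # subscript length and number of misses (arbitrary)

# Same number of misses, placed at the start vs. at the end.
misses_first = [True] * m + [False] * (n - m)
misses_last = [False] * (n - m) + [True] * m

cost_first = miss_comparisons(misses_first)
cost_last = miss_comparisons(misses_last)
print(cost_first, cost_last)  # prints 4950 94950
```

Under this model the same 100 misses cost roughly 19 times more
comparisons when they sit at the end of the subscript than at the start.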
As Martin wrote, there are basically two possible improvements that
are somewhat complementary:
1)
Ok, so it seems more like a bug somewhere than something I failed to
understand, then.
One of the surprises for me is that shuffling the data so the misses do
not happen one after the other seems to solve the issue...
Thanks,
Bernat
*Bernat Gel Moreno*
Bioinformatician
Hereditary Cancer Pr
Hi Bernat, Michael,
FWIW I reported this issue on R-devel a couple of times. Last time was
in 2013:
https://stat.ethz.ch/pipermail/r-devel/2013-May/066616.html
Cheers,
H.
On 06/29/2017 11:58 PM, Bernat Gel wrote:
Yes, that would explain part of the situation. But example cc5 shows
that hash
Yes, that would explain part of the situation. But example cc5 shows
that hash misses would account only for part of the time.
Thanks for taking a look into it
Bernat
*Bernat Gel Moreno*
Bioinformatician
Hereditary Cancer Program
Program of Predictive and Personalized Medicine of Cancer (PMPP
Preliminary analysis suggests that this is due to hash misses. When
that happens, R ends up doing costly string comparisons that are on
the order of n^2 where 'n' is the length of the subscript. Looking
into it.
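
To make the n^2 estimate concrete, here is a rough count in Python (not
R's code). It rests on the assumption that every element of the subscript
misses and that each miss is compared against all entries before it;
`worst_case_comparisons` is a hypothetical name used only for this sketch.

```python
def worst_case_comparisons(n):
    """Comparisons when all n subscript entries miss and entry i is
    compared against the i entries before it: n * (n - 1) / 2."""
    return sum(range(n))


small = worst_case_comparisons(1_000)
large = worst_case_comparisons(2_000)

# Doubling the subscript length roughly quadruples the work.
ratio = large / small
print(small, large, round(ratio, 2))
```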
On Thu, Jun 29, 2017 at 10:43 AM, Bernat Gel wrote:
> Hi all,
>
> This is not strictl