A very good point! (Although augmented and log-average tf both do some kind
of normalisation of the tf distribution before IDF weighting.)
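For concreteness, here is a minimal sketch of the two tf variants as they are usually defined in SMART notation. It is not scikit-learn code: the function names `augmented_tf` and `log_average_tf` are hypothetical, and taking the average over the non-zero counts of a single document is an assumption.

```python
import math


def augmented_tf(counts):
    """Augmented tf: 0.5 + 0.5 * tf / max_tf for terms present in the document."""
    max_tf = max(counts, default=0)
    # Terms absent from the document keep a weight of 0.
    return [0.5 + 0.5 * tf / max_tf if tf > 0 else 0.0 for tf in counts]


def log_average_tf(counts):
    """Log-average tf: (1 + log tf) / (1 + log avg_tf) for terms present."""
    nonzero = [tf for tf in counts if tf > 0]
    if not nonzero:
        return [0.0 for _ in counts]
    # Assumption: the average is taken over the terms that occur in the document.
    avg_tf = sum(nonzero) / len(nonzero)
    denom = 1 + math.log(avg_tf)
    return [(1 + math.log(tf)) / denom if tf > 0 else 0.0 for tf in counts]


short_doc = [4, 2, 0]
long_doc = [8, 4, 0]  # same term proportions, twice the length

# Augmented tf depends only on tf relative to the most frequent term,
# so both documents get identical weights before any IDF weighting.
print(augmented_tf(short_doc))
print(augmented_tf(long_doc))
print(log_average_tf(short_doc))
```

Both variants rescale the raw counts within each document, which is the sense in which they normalise the tf distribution before IDF is applied.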
___
scikit-learn mailing list
[email protected]
https://mail.python.org/mailman/listinfo/scikit-learn
It does include non-core points, but not points that lie farther than eps
from every core point. You can adjust eps and min_samples. But perhaps you
should just choose a different clustering algorithm if this is behaviour you
absolutely do not want.
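A small sketch to check this on toy data. The points, `eps` and `min_samples` values below are made up for illustration; only `DBSCAN`, `labels_` and `core_sample_indices_` come from scikit-learn.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Illustrative 1-D data: three dense points, one border point, one outlier.
X = np.array([[0.0], [0.4], [0.8], [1.6], [5.0]])

db = DBSCAN(eps=1.0, min_samples=3).fit(X)

labels = db.labels_.tolist()
core = db.core_sample_indices_.tolist()

# The point at 1.6 has only two neighbours within eps (itself and 0.8),
# so it is not a core point, yet it lies within eps of the core point 0.8.
border = [i for i, lab in enumerate(labels) if lab != -1 and i not in core]

print("labels:", labels)
print("core samples:", core)
print("border samples:", border)
```

With these values the point at 1.6 should receive the same cluster label as the three core points, while the point at 5.0 is labelled -1 (noise) because it is farther than eps from every core point.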
On 30 January 2018 at 23:24, AMIR SHANEHSAZZADEH <
amir.p.sh
Hi Yacine,
On 29/01/18 16:39, Yacine MAZARI wrote:
>> I wouldn't hate if length normalisation was added to
>> if it was shown that normalising before IDF
>> multiplication was more effective than (or complementary
>> to) norming afterwards.
I think this is one of the most important points here.
T
Okay, thanks for the replies.
@Joel: Should I go ahead and send a PR with the change to TfidfTransformer?
On Tue, Jan 30, 2018 at 5:27 AM, Joel Nothman
wrote:
> I don't think you will do this without an O(N) cost. The fact that it's
> done with a second pass is moot.
>
> My position stands: if
Hello,
I am working with the latest implementation of DBSCAN. I believe that
scikit-learn's implementation does not include non-core points in clusters.
This results in border points not being included in clusters. Is there any
way to remedy this issue so that border points are included in their
r