Hi Dhiraj,

This is a very interesting proposal, and it seems you have a very
good understanding of the issues. My main concern is on our side:
this does not seem like a high-priority issue. We haven't
seen any user requests for such functionality. Admittedly, though, I
don't think we had given much thought to how non-Western languages
interact with the search, and non-English-speaking users are probably
less likely to use our English-language mailing list or forum.

I also think a lot of users are used to full-text search in
applications and know how to work with it to get the results they
want. For example, with "Christina" and "Cristina", people type
"istina", knowing that it is unlikely to appear in other words. Or,
more simply, they just use the album title or track name if they aren't
sure of the spelling of the artist. Of course this isn't an ideal
solution, but I suspect it works well for a lot of people.

That said, I'm impressed with the level of thought and detail you put
into this and I would definitely encourage you to apply for GSoC 2010
with Mixxx. But you should be prepared to justify strongly why this is
important for Mixxx. I would also suggest that before you apply, you
think about whether there are any higher priority projects you would
be equally interested in working on. Either way though we'll be
looking forward to your application.

Thanks,

Adam Davison

On 28 March 2010 07:12, Dhiraj Lohiya <[email protected]> wrote:
> Please find replies inline.
> On Sun, Mar 28, 2010 at 1:23 AM, Lukas Smith <[email protected]> wrote:
>>
>> On 27.03.2010, at 20:51, Owen Williams wrote:
>>
>> >
>> >>  As for actually implementing it yourself from scratch, it probably
>> >> will have to be done using sqlite3_create_function:
>> >> http://www.sqlite.org/c3ref/create_function.html.
>> >>
>> >
>> > I have experience with sqlite and full text search, and I didn't get
>> > good results with FTS.  I had to use an external library called Xapian
>> > that worked much, much better. So I would consider that library even if
>> > it means adding a dependency.
>>
>>
>> Well, for something like Soundex or Double Metaphone you don't really need
>> full text search. However, both algorithms really only work well for single
>> words, which makes things a bit tricky, since you would have to maintain a
>> separate table with the hashes for each word.
>>
>
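
The per-word hash table described above can be sketched roughly as follows.
This is only an illustration, not Mixxx code: it uses Python's sqlite3
module, whose Connection.create_function wraps sqlite3_create_function, and
a plain Soundex. The table names (track, track_word), the SQL function name
phonetic_hash and the helpers add_track and search are all made up for the
example.

```python
import sqlite3

# Letter -> Soundex digit; vowels and h, w, y have no code.
_CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        _CODES[ch] = digit


def soundex(word):
    """Return the four-character Soundex code of a single word."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    out = word[0].upper()
    prev = _CODES.get(word[0], "")
    for ch in word[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            out += code
        if ch not in "hw":  # h and w do not separate repeated codes
            prev = code
    return (out + "000")[:4]


conn = sqlite3.connect(":memory:")
# Register the Python function so SQL statements can call it.
conn.create_function("phonetic_hash", 1, soundex)
conn.execute("CREATE TABLE track (id INTEGER PRIMARY KEY, artist TEXT)")
# Hypothetical side table: one row per word of each artist name.
conn.execute("CREATE TABLE track_word (track_id INTEGER, hash TEXT)")


def add_track(artist):
    """Insert a track and index each word of the artist name by its hash."""
    cur = conn.execute("INSERT INTO track (artist) VALUES (?)", (artist,))
    for word in artist.split():
        conn.execute(
            "INSERT INTO track_word VALUES (?, phonetic_hash(?))",
            (cur.lastrowid, word),
        )


def search(word):
    """Return artists containing a word that hashes like the query word."""
    rows = conn.execute(
        "SELECT artist FROM track WHERE id IN "
        "(SELECT track_id FROM track_word WHERE hash = phonetic_hash(?)) "
        "ORDER BY id",
        (word,),
    )
    return [row[0] for row in rows]


for name in ("Christina Aguilera", "Cristina Branco", "Daft Punk"):
    add_track(name)
print(search("Cristina"))
```

Because Soundex keeps the first letter, "Christina" and "Cristina" share a
hash here, but a query such as "Kristina" would not match either of them.
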
> Can we do it in the following way:
> When loading new songs into the database, a custom hash column would hold
> the full text hash, built in the following manner:
> The equivalence classes for similar sounds are classes that each list
> phonetic substrings that sound the same. They are represented as letters,
> since we have 52 English letters (rather than 10 digits) considering
> upper and lower case, and generally we only require 20-23 of those letters.
> Substrings are formed as continuous sequences of vowels and consonants.
> Some equivalence classes for Hindi, based on Hindi phonology (Devanagari
> script) (w.r.t. the example song name encoding below):
>
> p -> equivalence class P
> y | yy -> equivalence class Y
> a | aa -> equivalence class A
> r | rr -> equivalence class R
> k | c | q | ck -> equivalence class C
> m -> equivalence class M
> i | e | ee -> equivalence class E
> n | kn -> equivalence class N
>
> So, if the name of the song entered is "pyaar kameena", it is stored in the
> form of its equivalence classes as -> "PYAR CAMENA".
> Now when we take the search query from the user, we encode it according to
> this logic and then search. So whether the search query is any of the
> following
> pyar kamina | pyaar kamine | pyar kameena | pyar kaminaa etc. -> "PYAR
> CAMENA"
> The encoding for all of these will be the same, and this can then be easily
> searched using FTS. (If Xapian works better than the default, we could
> use that).
> Now this seems to be scalable even if the number of songs in the list is
> more than 1 lakh (100,000).
> Please feel free to tear down any of the above points. Thanks Mad Jester,
> Lukas and Owen for the continuous feedback and support.
> --
> Regards
> Dhiraj Lohiya
> IRC nick: Dj
> ------------------------------------------------------------------------------
> Download Intel® Parallel Studio Eval
> Try the new software tools for yourself. Speed compiling, find bugs
> proactively, and fine-tune applications for parallel performance.
> See why Intel Parallel Studio got high marks during beta.
> http://p.sf.net/sfu/intel-sw-dev
> _______________________________________________
> Mixxx-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/mixxx-devel
>
>
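
For reference, the equivalence-class encoding proposed in the quoted message
can be sketched in a few lines. This is only an illustration under some
assumptions: the rule table is copied from the message, matching is greedy
(longest substring first), and characters without a listed class are simply
upper-cased, which the message does not specify. The names RULES,
encode_word and encode are invented for the example.

```python
# Equivalence classes from the quoted message, longest substrings first so
# that "aa" is tried before "a", "ck" before "c", and so on.
RULES = [
    ("ck", "C"), ("kn", "N"), ("yy", "Y"), ("aa", "A"), ("rr", "R"),
    ("ee", "E"),
    ("p", "P"), ("y", "Y"), ("a", "A"), ("r", "R"), ("k", "C"), ("c", "C"),
    ("q", "C"), ("m", "M"), ("i", "E"), ("e", "E"), ("n", "N"),
]


def encode_word(word):
    """Replace each matching substring of a word by its class letter."""
    word = word.lower()
    out = []
    i = 0
    while i < len(word):
        for sub, cls in RULES:
            if word.startswith(sub, i):
                out.append(cls)
                i += len(sub)
                break
        else:
            # Assumption: characters with no listed class pass through
            # upper-cased; the message gives no rule for them.
            out.append(word[i].upper())
            i += 1
    return "".join(out)


def encode(text):
    """Encode every whitespace-separated word of a title or query."""
    return " ".join(encode_word(w) for w in text.split())


for query in ("pyaar kameena", "pyar kamina", "pyar kaminaa", "pyaar kamine"):
    print(query, "->", encode(query))
```

With only the classes listed in the message, "pyar kamina", "pyar kameena"
and "pyar kaminaa" all encode to "PYAR CAMENA", but "pyaar kamine" comes out
as "PYAR CAMENE", since a final "e" falls in class E rather than A. Matching
that spelling as well would need an extra rule that the message does not
give.
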
