[algogeeks] Re: Data Structure for URL matching

2010-08-19 Thread Chi
Hi Amit,

are you some sort of a genius or what do I expect now!? Can you answer
my question?

Best regards,

On Aug 17, 12:24 am, Amit Jaspal amitjaspal...@gmail.com wrote:
 http://www.ahhf45.com/info/Data_Structures_and_Algorithms/resources/t...

 Refer this link.



 On Mon, Aug 16, 2010 at 10:29 PM, Chi c...@linuxdna.com wrote:
  I'm not sure what you want. I have post a solution to search for
  wildcards in tries. Now you claim to do it better with TS-Trees. Do
  you know who to compute a reverse TS-Tree? And why don't you try first
  to code a radix-tree, which is a compressed trie and build then the
  reverse radix-tree? Here is a solution to code a radix-tree, crit-bit-
  tree, pat-tree with mininmal understanding of comupter science:

  http://www.codeproject.com/KB/string/pat_and_huff.aspx

  IMHO I'm not sure why a Ternary-Search-Tree should be faster then a
  Radix-Tree? If a radix tree is already a binary-tree version of the
  trie, then can you explain me the advantage of a ternary-search-tree?

  On Aug 15, 8:31 pm, Amit Jaspal amitjaspal...@gmail.com wrote:
   Seems a gud idea .
   I have read we can do better with Ternary Search Trees .Can anybody has
  any
   idea about it?

   On Sun, Aug 15, 2010 at 11:44 PM, Chi c...@linuxdna.com wrote:
What you may need is a reverse trie. You may take a look at this:

http://phpir.com/tries-and-wildcards

On Aug 15, 3:21 pm, amit amitjaspal...@gmail.com wrote:
 In our indexes, we have millions of URLs each of which has a link to
 some page contents, that is, URL-contents. Now, suppose a user types
 a query with wild cards *, which represent 0 or multiple occurrences
 of any characters, how do you build the indexes such that such a type
 of query can be executed efficiently by finding all corresponding
URLs-contents efficiently. For example, given a queryhttp://www.*o*ve*
ou.com.

 You need to find iloveyou.com, itveabcu.com, etc.

--
You received this message because you are subscribed to the Google
  Groups
Algorithm Geeks group. To post to this group, send email
  toalgoge...@googlegroups.com.
To unsubscribe from this group, send email to
  algogeeks+unsubscr...@googlegroups.comalgogeeks%2bunsubscr...@googlegroups.com
  algogeeks%2bunsubscr...@googlegroups .com
.
For more options, visit this group at
   http://groups.google.com/group/algogeeks?hl=en.

   --
   Amit Jaspal
   Btech IT IIIT- Allahabad

  --
  You received this message because you are subscribed to the Google Groups
  Algorithm Geeks group.
  To post to this group, send email to algoge...@googlegroups.com.
  To unsubscribe from this group, send email to
  algogeeks+unsubscr...@googlegroups.comalgogeeks%2bunsubscr...@googlegroups.com
  .
  For more options, visit this group at
 http://groups.google.com/group/algogeeks?hl=en.

 --
 Amit Jaspal
 Btech IT IIIT- Allahabad

-- 
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algoge...@googlegroups.com.
To unsubscribe from this group, send email to 
algogeeks+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/algogeeks?hl=en.



Re: [algogeeks] Re: Data Structure for URL matching

2010-08-17 Thread Amit Jaspal
http://www.ahhf45.com/info/Data_Structures_and_Algorithms/resources/technical_artile/ternary_search_tree/terstree.htm


Refer this link.

On Mon, Aug 16, 2010 at 10:29 PM, Chi c...@linuxdna.com wrote:

 I'm not sure what you want. I have post a solution to search for
 wildcards in tries. Now you claim to do it better with TS-Trees. Do
 you know who to compute a reverse TS-Tree? And why don't you try first
 to code a radix-tree, which is a compressed trie and build then the
 reverse radix-tree? Here is a solution to code a radix-tree, crit-bit-
 tree, pat-tree with mininmal understanding of comupter science:

  http://www.codeproject.com/KB/string/pat_and_huff.aspx

 IMHO I'm not sure why a Ternary-Search-Tree should be faster then a
 Radix-Tree? If a radix tree is already a binary-tree version of the
 trie, then can you explain me the advantage of a ternary-search-tree?


 On Aug 15, 8:31 pm, Amit Jaspal amitjaspal...@gmail.com wrote:
  Seems a gud idea .
  I have read we can do better with Ternary Search Trees .Can anybody has
 any
  idea about it?
 
 
 
 
 
  On Sun, Aug 15, 2010 at 11:44 PM, Chi c...@linuxdna.com wrote:
   What you may need is a reverse trie. You may take a look at this:
 
   http://phpir.com/tries-and-wildcards
 
   On Aug 15, 3:21 pm, amit amitjaspal...@gmail.com wrote:
In our indexes, we have millions of URLs each of which has a link to
some page contents, that is, URL-contents. Now, suppose a user types
a query with wild cards *, which represent 0 or multiple occurrences
of any characters, how do you build the indexes such that such a type
of query can be executed efficiently by finding all corresponding
   URLs-contents efficiently. For example, given a queryhttp://www.*o*ve*
   ou.com.
 
You need to find iloveyou.com, itveabcu.com, etc.
 
   --
   You received this message because you are subscribed to the Google
 Groups
   Algorithm Geeks group. To post to this group, send email
 toalgoge...@googlegroups.com.
   To unsubscribe from this group, send email to
 algogeeks+unsubscr...@googlegroups.comalgogeeks%2bunsubscr...@googlegroups.com
 algogeeks%2bunsubscr...@googlegroups .com
   .
   For more options, visit this group at
  http://groups.google.com/group/algogeeks?hl=en.
 
  --
  Amit Jaspal
  Btech IT IIIT- Allahabad

 --
 You received this message because you are subscribed to the Google Groups
 Algorithm Geeks group.
 To post to this group, send email to algoge...@googlegroups.com.
 To unsubscribe from this group, send email to
 algogeeks+unsubscr...@googlegroups.comalgogeeks%2bunsubscr...@googlegroups.com
 .
 For more options, visit this group at
 http://groups.google.com/group/algogeeks?hl=en.




-- 
Amit Jaspal
Btech IT IIIT- Allahabad

-- 
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algoge...@googlegroups.com.
To unsubscribe from this group, send email to 
algogeeks+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/algogeeks?hl=en.



[algogeeks] Re: Data Structure for URL matching

2010-08-17 Thread Chi
Hi Amit,

it is not proven that a ternary-search-tree is faster:

 http://nicolas.lehuen.com/category/pytst/

From the article: The difference between pytst and ctst

This gives 11 nodes for 7 strings. I won’t waste time to draw the ternary 
search tree equivalent, so you’ll have to trust me, but you would need at 
least 25 nodes. Of course the node don’t contain the same things, but the 
payload/overhead ratio is highly in favor of ctst.

Maybe it is a little faster but it has a bigger space consumption.


On Aug 17, 12:24 am, Amit Jaspal amitjaspal...@gmail.com wrote:
 http://www.ahhf45.com/info/Data_Structures_and_Algorithms/resources/t...

 Refer this link.



 On Mon, Aug 16, 2010 at 10:29 PM, Chi c...@linuxdna.com wrote:
  I'm not sure what you want. I have post a solution to search for
  wildcards in tries. Now you claim to do it better with TS-Trees. Do
  you know who to compute a reverse TS-Tree? And why don't you try first
  to code a radix-tree, which is a compressed trie and build then the
  reverse radix-tree? Here is a solution to code a radix-tree, crit-bit-
  tree, pat-tree with mininmal understanding of comupter science:

  http://www.codeproject.com/KB/string/pat_and_huff.aspx

  IMHO I'm not sure why a Ternary-Search-Tree should be faster then a
  Radix-Tree? If a radix tree is already a binary-tree version of the
  trie, then can you explain me the advantage of a ternary-search-tree?

  On Aug 15, 8:31 pm, Amit Jaspal amitjaspal...@gmail.com wrote:
   Seems a gud idea .
   I have read we can do better with Ternary Search Trees .Can anybody has
  any
   idea about it?

   On Sun, Aug 15, 2010 at 11:44 PM, Chi c...@linuxdna.com wrote:
What you may need is a reverse trie. You may take a look at this:

http://phpir.com/tries-and-wildcards

On Aug 15, 3:21 pm, amit amitjaspal...@gmail.com wrote:
 In our indexes, we have millions of URLs each of which has a link to
 some page contents, that is, URL-contents. Now, suppose a user types
 a query with wild cards *, which represent 0 or multiple occurrences
 of any characters, how do you build the indexes such that such a type
 of query can be executed efficiently by finding all corresponding
URLs-contents efficiently. For example, given a queryhttp://www.*o*ve*
ou.com.

 You need to find iloveyou.com, itveabcu.com, etc.

--
You received this message because you are subscribed to the Google
  Groups
Algorithm Geeks group. To post to this group, send email
  toalgoge...@googlegroups.com.
To unsubscribe from this group, send email to
  algogeeks+unsubscr...@googlegroups.comalgogeeks%2bunsubscr...@googlegroups.com
  algogeeks%2bunsubscr...@googlegroups .com
.
For more options, visit this group at
   http://groups.google.com/group/algogeeks?hl=en.

   --
   Amit Jaspal
   Btech IT IIIT- Allahabad

  --
  You received this message because you are subscribed to the Google Groups
  Algorithm Geeks group.
  To post to this group, send email to algoge...@googlegroups.com.
  To unsubscribe from this group, send email to
  algogeeks+unsubscr...@googlegroups.comalgogeeks%2bunsubscr...@googlegroups.com
  .
  For more options, visit this group at
 http://groups.google.com/group/algogeeks?hl=en.

 --
 Amit Jaspal
 Btech IT IIIT- Allahabad

-- 
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algoge...@googlegroups.com.
To unsubscribe from this group, send email to 
algogeeks+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/algogeeks?hl=en.



[algogeeks] Re: Data Structure for URL matching

2010-08-16 Thread Chi
I'm not sure what you want. I have post a solution to search for
wildcards in tries. Now you claim to do it better with TS-Trees. Do
you know who to compute a reverse TS-Tree? And why don't you try first
to code a radix-tree, which is a compressed trie and build then the
reverse radix-tree? Here is a solution to code a radix-tree, crit-bit-
tree, pat-tree with mininmal understanding of comupter science:

 http://www.codeproject.com/KB/string/pat_and_huff.aspx

IMHO I'm not sure why a Ternary-Search-Tree should be faster then a
Radix-Tree? If a radix tree is already a binary-tree version of the
trie, then can you explain me the advantage of a ternary-search-tree?


On Aug 15, 8:31 pm, Amit Jaspal amitjaspal...@gmail.com wrote:
 Seems a gud idea .
 I have read we can do better with Ternary Search Trees .Can anybody has any
 idea about it?





 On Sun, Aug 15, 2010 at 11:44 PM, Chi c...@linuxdna.com wrote:
  What you may need is a reverse trie. You may take a look at this:

  http://phpir.com/tries-and-wildcards

  On Aug 15, 3:21 pm, amit amitjaspal...@gmail.com wrote:
   In our indexes, we have millions of URLs each of which has a link to
   some page contents, that is, URL-contents. Now, suppose a user types
   a query with wild cards *, which represent 0 or multiple occurrences
   of any characters, how do you build the indexes such that such a type
   of query can be executed efficiently by finding all corresponding
  URLs-contents efficiently. For example, given a queryhttp://www.*o*ve*
  ou.com.

   You need to find iloveyou.com, itveabcu.com, etc.

  --
  You received this message because you are subscribed to the Google Groups
  Algorithm Geeks group. To post to this group, send email 
  toalgoge...@googlegroups.com.
  To unsubscribe from this group, send email 
  toalgogeeks+unsubscr...@googlegroups.comalgogeeks%2bunsubscr...@googlegroups
   .com
  .
  For more options, visit this group at
 http://groups.google.com/group/algogeeks?hl=en.

 --
 Amit Jaspal
 Btech IT IIIT- Allahabad

-- 
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algoge...@googlegroups.com.
To unsubscribe from this group, send email to 
algogeeks+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/algogeeks?hl=en.



[algogeeks] Re: Data Structure for URL matching

2010-08-15 Thread Chi
What you may need is a reverse trie. You may take a look at this:

 http://phpir.com/tries-and-wildcards

On Aug 15, 3:21 pm, amit amitjaspal...@gmail.com wrote:
 In our indexes, we have millions of URLs each of which has a link to
 some page contents, that is, URL-contents. Now, suppose a user types
 a query with wild cards *, which represent 0 or multiple occurrences
 of any characters, how do you build the indexes such that such a type
 of query can be executed efficiently by finding all corresponding 
 URLs-contents efficiently. For example, given a queryhttp://www.*o*ve*ou.com.

 You need to find iloveyou.com, itveabcu.com, etc.

-- 
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algoge...@googlegroups.com.
To unsubscribe from this group, send email to 
algogeeks+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/algogeeks?hl=en.



[algogeeks] Re: Data Structure for URL matching

2010-08-15 Thread Chi
What you need is a reverse trie dictionary. You may take a look at
this:

 http://phpir.com/tries-and-wildcards

On Aug 15, 3:21 pm, amit amitjaspal...@gmail.com wrote:
 In our indexes, we have millions of URLs each of which has a link to
 some page contents, that is, URL-contents. Now, suppose a user types
 a query with wild cards *, which represent 0 or multiple occurrences
 of any characters, how do you build the indexes such that such a type
 of query can be executed efficiently by finding all corresponding 
 URLs-contents efficiently. For example, given a queryhttp://www.*o*ve*ou.com.

 You need to find iloveyou.com, itveabcu.com, etc.

-- 
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algoge...@googlegroups.com.
To unsubscribe from this group, send email to 
algogeeks+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/algogeeks?hl=en.



Re: [algogeeks] Re: Data Structure for URL matching

2010-08-15 Thread Amit Jaspal
Seems a gud idea .
I have read we can do better with Ternary Search Trees .Can anybody has any
idea about it?

On Sun, Aug 15, 2010 at 11:44 PM, Chi c...@linuxdna.com wrote:

 What you may need is a reverse trie. You may take a look at this:

  http://phpir.com/tries-and-wildcards

 On Aug 15, 3:21 pm, amit amitjaspal...@gmail.com wrote:
  In our indexes, we have millions of URLs each of which has a link to
  some page contents, that is, URL-contents. Now, suppose a user types
  a query with wild cards *, which represent 0 or multiple occurrences
  of any characters, how do you build the indexes such that such a type
  of query can be executed efficiently by finding all corresponding
 URLs-contents efficiently. For example, given a queryhttp://www.*o*ve*
 ou.com.
 
  You need to find iloveyou.com, itveabcu.com, etc.

 --
 You received this message because you are subscribed to the Google Groups
 Algorithm Geeks group.
 To post to this group, send email to algoge...@googlegroups.com.
 To unsubscribe from this group, send email to
 algogeeks+unsubscr...@googlegroups.comalgogeeks%2bunsubscr...@googlegroups.com
 .
 For more options, visit this group at
 http://groups.google.com/group/algogeeks?hl=en.




-- 
Amit Jaspal
Btech IT IIIT- Allahabad

-- 
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algoge...@googlegroups.com.
To unsubscribe from this group, send email to 
algogeeks+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/algogeeks?hl=en.