[PD] search plugin update (was: Re: reverse kickstarter update)

2013-09-15 Thread Jonathan Wilkes

Hi list,
 Attached is a first pass at using the Xapian backend to
search Pure Data docs.

What the revision does:
* simplifies building a search index.  It builds once, on the first
search, and all subsequent searches happen very fast.  Previously
it searched the docs themselves every single time and depended
on the OS caching the data, resulting in sluggish performance
especially on Windows.
* natural language, probabilistic searches.  The search terms
in the index were automatically chosen by the engine with no
customization, and already the results are decent.
* nearly no input errors.  Xapian has its own simple syntax, but
for most cases users can ignore it and type in natural language
searches (like Google).  And the few errors the user
can generate have meaningful feedback to the console.  Also,
since I'm passing the input as a string you don't have to worry
about malformed tcl lists or weird characters that previously caused
errors.
* everything, including pd files, pdfs and html, is indexed properly
and so will get included in the results in the proper place.
* gives the ability to add results from a remote database with a
couple lines of code.
* allows the removal of Match all terms and Match whole words
checkbuttons, simplifying the interface.
* performs stemming out of the box-- that is, when you search for
edit, the engine also takes into account editing, edits, edited,
etc.
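
To make the "build once, then search the index" and stemming points concrete, here is a toy sketch in plain Tcl. It is not how Xapian works internally: the ::toyindex namespace, its procs, and the naive suffix-stripping rule are all made up for illustration, and Xapian's real stemmer and on-disk database are far more capable.

```tcl
# Toy sketch only: a lazily built in-memory index with naive suffix stripping.
# All names here are hypothetical; this is not Xapian's algorithm.
namespace eval ::toyindex {
    variable index [dict create]   ;# stem -> list of document names
    variable built 0
}

# Naive stemmer: lowercase the word and strip a few common English suffixes.
proc ::toyindex::stem {word} {
    set word [string tolower $word]
    regsub {(ing|ed|s)$} $word {} word
    return $word
}

# Build the index from a dict of {name text} pairs.
proc ::toyindex::build {docs} {
    variable index
    variable built
    set index [dict create]
    dict for {name text} $docs {
        foreach word [regexp -all -inline {[[:alpha:]]+} $text] {
            set s [stem $word]
            if {![dict exists $index $s] || $name ni [dict get $index $s]} {
                dict lappend index $s $name
            }
        }
    }
    set built 1
}

# Search: build the index on first use, then answer from the index only.
proc ::toyindex::search {docs term} {
    variable index
    variable built
    if {!$built} { build $docs }
    set s [stem $term]
    if {[dict exists $index $s]} { return [dict get $index $s] }
    return {}
}
```

With docs set to a dict such as {a.pd "Editing a patch" b.pd "He edits and edited"}, calling ::toyindex::search $docs edit returns both file names, because editing, edits and edited all reduce to the same stem.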

Installation for Linux (Debian):
1) Make sure you have libxapian and tclxapian packages
installed.  Other distros probably have corresponding packages.
2) put search-plugin.tcl in the /startup directory, or if you're
using Pd vanilla just make sure it's in a directory that's specified
in the Path dialog.
3) Run Pd and press ctrl-h or choose Search from the Help
menu.
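
One way to check step 1 before starting Pd is to ask Tcl whether the xapian package loads. This is only a sketch: have_xapian is a made-up helper name, and the version number is the one the plugin itself requires.

```tcl
# Hypothetical helper: returns 1 if the Tcl bindings for Xapian can be
# loaded, 0 otherwise.  catch keeps a missing package from raising an error.
proc have_xapian {} {
    return [expr {![catch {package require xapian 1.0.0}]}]
}

if {[have_xapian]} {
    puts "xapian bindings found"
} else {
    puts "xapian bindings missing: install the libxapian and tclxapian packages"
}
```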

Further work that needs to be done:
* need to figure out where to create the database directory on
Linux, OSX, and Windows.  The directory needs to be read/writable.
Is there an easy way to do this?
* need a Cancel button next to the progressbar when indexing,
so the user can cancel a long index.

Further work that could be done:
* add pd meta tag/values to the index terms for each document.
This would make it possible to type keyword:foo or author:bar
to search based solely on that pd meta tag/value.
* add filenames to terms
* add object terms so the user can search pd patches for
a particular object instance, i.e., object:clip
* limit the document data in the database to pd meta tags/values
and other metadata.  Right now I'm storing the _entire_ doc text
in the database which obviously wastes space.
* xapian has all kinds of features, like suggesting related searches,
and realtime results.  The latter could be very handy for autocompletion
in object boxes, for example.
* could use the title of html files as the result description
* could plug in to puredata.info to search for externals, plugins, etc.
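
As a rough illustration of the keyword:foo / author:bar / object:clip idea, the sketch below splits a query into prefixed fields and plain words in pure Tcl. The proc name parse_query and the fixed set of prefixes are assumptions for the example; in a real implementation Xapian's QueryParser would do this job through its prefix support.

```tcl
# Hypothetical helper: split a query string into prefixed fields and
# plain search words.  Returns a two-element list: {fields-dict plain-words}.
proc parse_query {query} {
    set fields [dict create]
    set plain {}
    foreach word [split [string trim $query]] {
        if {$word eq ""} continue
        # Only these three prefixes are recognised in this sketch.
        if {[regexp {^(keyword|author|object):(.+)$} $word -> field value]} {
            dict lappend fields $field $value
        } else {
            lappend plain $word
        }
    }
    return [list $fields $plain]
}

lassign [parse_query "object:clip author:bar low pass"] fields plain
puts "fields: $fields"
puts "plain:  $plain"
```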

As always, feedback welcome.  And feel free to donate some rice
and beans if you can!
https://jwilkes.nfshost.com/donations.php

Best,
Jonathan
# browse docs or search all the documentation using a regexp
# check the Help menu for the Browser item to use it

# todo: use xapian syntax for meta keywords
#keyword:foo
# todo: when cancelling a db index build, we need to remove
# the database completely
# todo: remove both checkbuttons-- not needed
# todo: do newline regsub and document parsing on indexing
# todo: make libdir listing check for duplicates
# todo: hook into the dialog_bindings
# TODO remove the doc_ prefix on procs where it's not needed
# TODO enter and up/down/left/right arrow key bindings for nav

# redesign:
# [  search entry  ] Help
# [search] [filter]
#

package require Tk 8.5
package require pd_bindings
package require pd_menucommands
package require xapian 1.0.0

namespace eval ::dialog_helpbrowser2:: {

    variable doctypes "*.{pd,pat,mxb,mxt,help,txt,htm,html,pdf}"

    variable searchfont [list {DejaVu Sans}]
    variable searchtext {}
    variable search_history {}
    variable count {}
    # $i controls the build_index recursive loop
    variable i
    variable filelist {}
    variable progress {}
    variable navbar {}
    variable genres
    variable cancelled
    variable database {}
}

## help browser and support functions #
proc ::dialog_helpbrowser2::open_helpbrowser {mytoplevel} {
    if {[winfo exists $mytoplevel]} {
        wm deiconify $mytoplevel
        raise $mytoplevel
    } else {
        create_dialog $mytoplevel
    }
}

proc ::dialog_helpbrowser2::create_dialog {mytoplevel} {
    variable searchfont
    variable selected_file
    # each genre name is a single quoted argument to the translation proc
    variable genres [list [_ "All documents"] \
                         [_ "Object Help Patches"] \
                         [_ "All About Pd"] \
                         [_ "Tutorials"] \
                         [_ "Manual"] \
                         [_ "Uncategorized"] \
                        ]
    variable count
    foreach genre $genres {
        lappend 

Re: [PD] search plugin update (was: Re: reverse kickstarter update)

2013-09-15 Thread Dan Wilcox
On Sep 15, 2013, at 3:23 PM, pd-list-requ...@iem.at wrote:

> * need to figure out where to create the database directory on
> Linux, OSX, and Windows.  The directory needs to be read/writable.
> Is there an easy way to do this?

For Linux & Windows, why not put it in the same location as the pd settings
file?

On OSX, I'd put it in ~/Library/Application Support/pd (or pd-extended).
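
A minimal Tcl sketch of that suggestion follows. The proc names, the search-db directory name, and the Linux and Windows locations are assumptions for illustration (only the OSX path comes from this thread); the platform values are passed in so the logic can be tried on any machine.

```tcl
# Hypothetical helper (not part of the plugin): choose a per-user
# directory for the search database.
# platform - value of $::tcl_platform(platform), e.g. unix or windows
# os       - value of $::tcl_platform(os), e.g. Linux or Darwin
# home     - the user's home directory
# appdata  - Windows per-user application data directory (assumption)
proc search_db_dir {platform os home {appdata ""}} {
    if {$os eq "Darwin"} {
        # Location suggested in this thread for OSX
        return [file join $home Library "Application Support" pd search-db]
    }
    if {$platform eq "windows" && $appdata ne ""} {
        # Assumption: keep it under the per-user application data folder
        return [file join $appdata Pd search-db]
    }
    # Assumption: on Linux, a dot-directory in the home directory,
    # which is where ~/.pdsettings lives
    return [file join $home .pd-search-db]
}

# Create the directory if needed and report whether it is writable.
proc ensure_db_dir {dir} {
    file mkdir $dir
    return [file writable $dir]
}
```

In the plugin this could be called as search_db_dir $::tcl_platform(platform) $::tcl_platform(os) $::env(HOME), followed by ensure_db_dir on the result before opening the database.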


Dan Wilcox
@danomatika
danomatika.com
robotcowboy.com





___
Pd-list@iem.at mailing list
UNSUBSCRIBE and account-management - 
http://lists.puredata.info/listinfo/pd-list


Re: [PD] Search plugin update

2013-01-21 Thread Hans-Christoph Steiner

This definitely sounds quite useful.

Scrolling to the selection is not something easy to do right now, but it's
something that could be made easy to do.  Basically, if the selection is
tagged with a tag that marks it as the selection, then it would be easy to
find the selection object's location and scroll to it all in Tcl.

If you grep the pd-extended source for select_color you'll find the spots
that need to be changed.  I think the trickier bit might be removing the tag
once the selection is done.  This is a patch I'd accept and would lobby to have
Miller include also.  This is a behavior that Pd's find panel should also
have.

.hc

On 01/21/2013 01:04 AM, Jonathan Wilkes wrote:
> I updated the homepage of the search plugin to point to a
> pd glossary that I wrote awhile back and forgot about.
>
> It's kind of neat-- you can add entries to the text file in
> doc/5.reference/glossary.txt and doc/5.reference/glossary.pd
> will parse the file, sort the entries in alphabetical order, and
> display them in the patch with links to objects related to the
> terms.  They probably need some work so feel free to make/
> suggest changes.
>
> Unfortunately ctrl-f Find won't scroll to the relevant part
> of a long patch if the match happens to be out of view.  Is there
> a way to fix this?
>
> -Jonathan



[PD] Search plugin update

2013-01-20 Thread Jonathan Wilkes
I updated the homepage of the search plugin to point to a
pd glossary that I wrote awhile back and forgot about.

It's kind of neat-- you can add entries to the text file in
doc/5.reference/glossary.txt and doc/5.reference/glossary.pd
will parse the file, sort the entries in alphabetical order, and
display them in the patch with links to objects related to the
terms.  They probably need some work so feel free to make/
suggest changes.


Unfortunately ctrl-f Find won't scroll to the relevant part
of a long patch if the match happens to be out of view.  Is there
a way to fix this?

-Jonathan




Re: [PD] search plugin update

2011-08-28 Thread Hans-Christoph Steiner


0 0 is problematic on a couple of platforms.  On Mac OS X, the menubar is
always there, so it puts the window header behind the menubar.  A
similar problem happens on GNOME.


.hc

On Aug 25, 2011, at 5:33 PM, Jonathan Wilkes wrote:

> Ok, fixed the weird resizing issue when the text in the status area
> is larger than the window.
>
> Fixed search window to appear at 0 0 when it's first created.
>
> Fixed font sizing bindings.
>
> Fixed minimum font size.
>
> -Jonathan

All mankind is of one author, and is one volume; when one man dies,  
one chapter is not torn out of the book, but translated into a better  
language; and every chapter must be so translated -John Donne






Re: [PD] search plugin update

2011-08-25 Thread Jonathan Wilkes
Ok, fixed the weird resizing issue when the text in the status area is larger 
than the window.

Fixed search window to appear at 0 0 when it's first created.


Fixed font sizing bindings.

Fixed minimum font size.

-Jonathan




Re: [PD] search plugin update

2011-08-07 Thread Mathieu Bouchard

On Sat, 6 Aug 2011, Hans-Christoph Steiner wrote:

> - on Mac OS X Cmd-Shift-= (i.e. Cmd-+) is the standard key for
> increasing the size of the text.  Currently, it's Cmd-=.


It will break on keyboard layouts that are not QWERTY or that are heavily 
modified QWERTY.


When I designed some things in the default DD keyboard bindings, I only
had US keyboard and CF-family keyboards in mind (French QWERTY used in
Québec) and then someone notified me that I couldn't distinguish
Alt+Shift+1 from Alt+1 because 1 is already shifted in AZERTY (it's
Shift-&, whereas & is not shifted).


German QWERTZ has = on Shift+0 and * on Shift++, meaning + is unshifted ; 
however, Swiss QWERTZ has + shifted as Shift+1, and then there are other 
QWERTZ than that...


 ___
| Mathieu Bouchard  tél: +1.514.383.3801  Villeray, Montréal, QC


Re: [PD] search plugin update

2011-08-07 Thread Hans-Christoph Steiner


On Aug 7, 2011, at 2:51 PM, Mathieu Bouchard wrote:

> On Sat, 6 Aug 2011, Hans-Christoph Steiner wrote:
>
>> - on Mac OS X Cmd-Shift-= (i.e. Cmd-+) is the standard key for
>> increasing the size of the text.  Currently, its Cmd-=.
>
> It will break on keyboard layouts that are not QWERTY or that are
> heavily modified QWERTY.
>
> When I designed some things in the default DD keyboard bindings, I
> only had US keyboard and CF-family keyboards in mind (french QWERTY
> used in Québec) and then someone notified me that I couldn't distinguish
> Alt+Shift+1 from Alt+1 because 1 is already shifted in AZERTY (it's
> Shift-&, whereas & is not shifted).
>
> German QWERTZ has = on Shift+0 and * on Shift++, meaning + is
> unshifted ; however, Swiss QWERTZ has + shifted as Shift+1, and then
> there are other QWERTZ than that...



It'd be something to test: Cmd-+ might work as a keybinding, and would
then work on other keyboards.  Or perhaps you can just bind to both
Cmd-Shift-+ and Cmd-+.  For other platforms, it's not a big deal since
the keybindings are not very consistent.  On Mac OS X, they are quite
consistent across OS and apps, so people notice wrong bindings a lot
more.


.hc



"We must become the change we want to see." - Mahatma Gandhi




Re: [PD] search plugin update

2011-08-06 Thread Hans-Christoph Steiner


It's definitely getting quite polished.  Two little details on Mac OS X
that are odd:

- on Mac OS X Cmd-Shift-= (i.e. Cmd-+) is the standard key for
increasing the size of the text.  Currently, it's Cmd-=.

- on Mac OS X, when I mouse over the object name in the search
results, the width of the whole window jumps, because the full path
displayed on the bottom of the window is a lot longer than the normal
window size would display.


.hc

On Aug 4, 2011, at 5:49 PM, Jonathan Wilkes wrote:

> Search plugin revision:
> * added status bar shows link locations and search text (if
> searching for a keyword tag)
> * quoted text works, e.g., "it's a secret to everybody"
> * regexes seem to work, e.g., outlet.*symbol will match all
> objects that output a symbol
> * pd-style word boundaries, e.g., clip~ works when whole words
> option is checked
> * font +/- with ctrl-plus/ctrl-minus keys
>
> -Jonathan






If you are not part of the solution, you are part of the problem.




[PD] search plugin update

2011-08-04 Thread Jonathan Wilkes
Search plugin revision:
* added a status bar that shows link locations and search text (if searching
for a keyword tag)
* quoted text works, e.g., "it's a secret to everybody"
* regexes seem to work, e.g., outlet.*symbol will match all objects that
output a symbol
* pd-style word boundaries, e.g., clip~ works when the whole words option is
checked
* font +/- with ctrl-plus/ctrl-minus keys
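
For readers wondering what a "pd-style word boundary" might look like as a regexp, here is one possible sketch. It is not necessarily what the attached plugin does: whole_word_match is a made-up name, and treating ~ as a word character is an assumption chosen so that clip~ does not match inside noclip~ or clip~~.

```tcl
# Hypothetical helper: return 1 if term occurs in text as a whole word,
# where letters, digits, underscore and ~ all count as word characters.
proc whole_word_match {term text} {
    # Escape regexp metacharacters so the term is matched literally.
    set escaped [regsub -all {[][{}()*+?.\\^$|]} $term {\\&}]
    # Anything that is not a word character counts as a boundary.
    set boundary {[^[:alnum:]_~]}
    return [regexp -- "(^|$boundary)${escaped}(\$|$boundary)" $text]
}

puts [whole_word_match clip~ "#X obj 10 10 clip~ -1 1;"]
puts [whole_word_match clip~ "#X obj 10 10 noclip~ -1 1;"]
```

The first call finds clip~ as a whole word; the second does not, because the match is preceded by a word character.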

-Jonathan
# plugin to allow searching all the documentation using a regexp
# check the Help menu for the Search item to use it

# Bugs:
# tiny text in combobox dropdown menu on Windows
# can't interrupt long searches on Windows (never get them in Fedora 15)
# Todo:
# try to clean up user input to prevent regex error messages

package require Tk 8.5
package require pd_bindings
package require pd_menucommands

namespace eval ::dialog_search:: {
    variable searchtext {}
    variable search_history {}
    variable count {}
    # each genre name is a single quoted argument to the translation proc
    variable genres [list [_ "All documents"] \
                         [_ "Object Help Patches"] \
                         [_ "All About Pd"] \
                         [_ "Tutorials"] \
                         [_ "Manual"] \
                         [_ "Uncategorized"] \
                        ]
}

# TODO check line formatting options

# find_doc_files
# basedir - the directory to start looking in
proc ::dialog_search::find_doc_files { basedir } {
    # Fix the directory name, this ensures the directory name is in the
    # native format for the platform and contains a final directory separator
    set basedir [string trimright [file join $basedir { }]]
    set fileList {}

    # Look in the current directory for matching files, -type {f r}
    # means only readable normal files are looked at, -nocomplain stops
    # an error being thrown if the returned list is empty
    foreach fileName [glob -nocomplain -type {f r} -path $basedir $helpbrowser::doctypes] {
        lappend fileList $fileName
    }

    # Now look for any subdirectories in the current directory
    foreach dirName [glob -nocomplain -type {d r} -path $basedir *] {
        # Recursively call the routine on the subdirectory
        # (if it's not already in Pd's search path) and
        # append any new files to the results
        set nomatch [lsearch [concat [file join $::sys_libdir doc] $::sys_searchpath $::sys_staticpath] $dirName]
        if { $nomatch eq -1 } {
            set subDirList [find_doc_files $dirName]
            if { [llength $subDirList] > 0 } {
                foreach subDirFile $subDirList {
                    lappend fileList $subDirFile
                }
            }
        }
    }
    return $fileList
}

# TODO: break up into: l
proc ::dialog_search::open_file { xpos ypos mytoplevel clicked } {
    set textwidget $mytoplevel.resultstext
    # find the filename and basedir tags at or after the mouse position
    set i [$textwidget index @$xpos,$ypos]
    set range [$textwidget tag nextrange filename $i]
    set filename [eval $textwidget get $range]
    set range [$textwidget tag nextrange basedir $i]
    set basedir [eval $textwidget get $range]
    append basedir /
    if {$clicked eq 1} {
        if {$filename ne ""} {
            menu_doc_open $basedir $filename
        }
    } else {
        $mytoplevel.statusbar configure -text "Open $basedir$filename"
    }
}

# only does keywords for now-- maybe expand this to handle any meta tags
proc ::dialog_search::grab_metavalue { xpos ypos mytoplevel clicked } {
    set textwidget $mytoplevel.resultstext
    #set xpos_offset 20
    #set xpos [expr {$xpos + $xpos_offset}]
    set i [$textwidget index @$xpos,$ypos]
    set range [$textwidget tag nextrange metavalue_h $i]
    set value [eval $textwidget get $range]
    set text {keywords.*}
    append text $value
    if {$clicked eq 1} {
        set ::dialog_search::searchtext ""
        set ::dialog_search::searchtext $text
        ::dialog_search::search
    } else {
        $mytoplevel.statusbar configure -text $text
    }
}

# show/hide results based on genre
proc ::dialog_search::filter_results { combobox text } {
    variable genres
    set elide {}
    if { [$combobox current] eq 0 } {
        foreach genre $genres {
            $text tag configure [join $genre _] -elide off
            set tag [join $genre _]
            append tag _count
            $text tag configure $tag -elide on
        }
        set tag [join [lindex $genres 0] _]
        append tag _count
        $text tag configure $tag -elide off
    } else {
        foreach genre $genres {
            if { [$combobox get] ne $genre } {
                $text tag configure [join $genre _] -elide on
                set tag [join $genre _]
                append tag _count
                $text tag configure $tag -elide on
            } else {
                $text tag configure [join $genre _] -elide off
                set tag [join $genre _]
                append tag _count
                $text tag configure $tag -elide off
            }
        }
    }
    $combobox selection clear
    focus $text
}

proc ::dialog_search::readfile {filename} {
    set fp [open $filename]
    set file_contents [split [read $fp] \n]
    close $fp
    return $file_contents
}

proc ::dialog_search::search { } {
    variable searchtext
    variable search_history
    if {$searchtext eq ""} return
    if { [lsearch $search_history $searchtext] eq -1 } {
	lappend