Well OK, but sorry in advance if this bores people.

I'm working on a frontend to CONTENTdm, specifically for viewing historical newspapers. Pretty much like the LC newspapers (http://chroniclingamerica.loc.gov/), but with CDM on the backend instead of groovy XML stuff.

Tribulation: is it even possible to do a combined fulltext and date range search? I'm using the new dmwebservices interface that's been included in v.6. I'm pretty certain (after having crawled around the code for about a week) that neither dmwebservices nor dmQuery (in DMSystem.php) is the problem, so the suspect becomes the black-box "Find service". It seems like when a date search is combined with fulltext, fulltext suddenly gets redefined to be somewhat less than the actual full text. I don't know exactly what it gets reduced to, but it seems to combine title and description and maybe subject, but not the actual OCR'ed "full text". Remove the date clause from the search, and everything's fine. I've tried this on our vanilla installation and get the same problem. Is this a known thing? Google reveals nothing, nor does the official site. I'm pretty close to just giving up and decoupling the two in the search interface, but that seems really unsatisfying.

Triumph: in an OCR'ed collection, there will be a "words.txt" and a "words2.txt" file. The coordinates for each word are stored in words2.txt as <term x, y, width, height>, in units of 1/65535 of the original image's width (for x and width) or height (for y and height). From there you can just overlay a positioned <div> instead of relying on the composited image you get from getimage.exe (which crashes quite reliably when the image border intersects a highlight). What the difference between words.txt and words2.txt is, I don't know yet; but I've written a little script to pull pixel coordinates of terms out of words2.txt, if anyone wants.
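
For anyone who wants to try the overlay idea, here is a rough Python sketch of the coordinate conversion. This is not the script mentioned above, just an illustration. The only things taken from the description are the 1/65535 scaling and the field order (term, x, y, width, height); the exact line layout of words2.txt, the single-token term, and the names parse_line, to_pixels and overlay_style are all assumptions made for the example.

```python
import re

# From the description above: coordinates are 1/65535ths of the image width/height.
SCALE = 65535


def parse_line(line):
    """Parse one record, ASSUMING a layout like '<term x, y, width, height>'.

    The real words2.txt layout may differ; adjust this function to match.
    The term is assumed to be a single whitespace-free token.
    """
    cleaned = line.strip().strip("<>")
    term, rest = cleaned.split(None, 1)
    x, y, w, h = (int(n) for n in re.split(r"[,\s]+", rest.strip()))
    return term, x, y, w, h


def to_pixels(x, y, w, h, image_width, image_height):
    """Scale 1/65535 units to pixels for an image of the given size."""
    return (
        round(x * image_width / SCALE),
        round(y * image_height / SCALE),
        round(w * image_width / SCALE),
        round(h * image_height / SCALE),
    )


def overlay_style(line, image_width, image_height):
    """Return the term and an inline CSS string for a positioned highlight <div>."""
    term, x, y, w, h = parse_line(line)
    left, top, width, height = to_pixels(x, y, w, h, image_width, image_height)
    style = (
        f"position:absolute;left:{left}px;top:{top}px;"
        f"width:{width}px;height:{height}px"
    )
    return term, style
```

Pass the pixel size of the image as it is actually displayed (not necessarily the original scan size), so the highlight lines up with a scaled-down page image.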


On 11-05-27 09:21 PM, Kevin S. Clarke wrote:
I'm sure there are folks on this mailing list who use ContentDM.  You
could always post advances, trials, and tribulations here.

Kevin


On Fri, May 27, 2011 at 11:31 PM, Rod McFarland <rod.mcfarl...@ubc.ca> wrote:
Subject tells it all really, I've found some really old wikis and a bunch of
unhelpful Powerpoints via Google. The forum on the official page seems to be
pretty much dormant. Is there an untainted forum for CONTENTdm
users/hackers/victims out there? I've pretty much given up on the OCLC
support, but I've made some advances to share, and met some roadblocks to
ask about.

If there isn't one, I could probably set something up, if there's interest.
