<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="/xslt/rss2.xsl" media="screen"?>
<rss version="2.0"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/">
	<channel>
		<title>Latest from 飞腿站长站's Sphinx</title>
		<link>http://www.feitui.com/go/sphinx</link>
		<description>A place for webmasters to discuss and share site-building experience</description>
		<category>Technology</category>
		<language>zh_cn</language>
		<item>
			<title>LibMMSeg, a Chinese word segmentation package for Sphinx ... no reply</title>
			<link>http://www.feitui.com/topic/view/293.html</link>
			<comments>http://www.feitui.com/topic/view/293.html#reply</comments>
			<dc:creator>蓝色梦幻</dc:creator>
			<author>蓝色梦幻</author>
			<!--<enclosure url="http://www.feitui.com/img/p/1.jpg" type="image/jpeg" />
			<enclosure url="http://www.feitui.com/img/p/1_s.jpg" type="image/jpeg" />
			<enclosure url="http://www.feitui.com/img/p/1_n.jpg" type="image/jpeg" />-->
			<category>Sphinx</category>
			<description>
			&lt;a href="http://www.coreseek.cn/opensource/mmseg/" rel="nofollow external" class="tpc"&gt;http://www.coreseek.cn/opensource/mmseg/&lt;/a&gt;&lt;br /&gt;
LibMMSeg is a Chinese word segmentation package designed by Coreseek.com for the Sphinx full-text search engine. It is released under the GPL and uses Chih-Hao Tsai's MMSEG algorithm.
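&lt;br /&gt;
&lt;br /&gt;
As a rough illustration only, here is a minimal Python sketch of forward maximum matching, the basic dictionary lookup idea that MMSEG builds on. It is not LibMMSeg's code, the complete MMSEG algorithm adds further disambiguation rules on top of this, and the function name and the tiny lexicon are made up for the example.&lt;br /&gt;

```python
def max_match(text, lexicon, max_len=4):
    """Greedy forward maximum matching over a set of known words."""
    tokens = []
    pos = 0
    while pos != len(text):
        # Try the longest candidate first and shrink until the lexicon has it.
        # A single character is always accepted as a fallback.
        for size in range(min(max_len, len(text) - pos), 0, -1):
            piece = text[pos:pos + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                pos += size
                break
    return tokens


# Hypothetical lexicon, for illustration only.
lexicon = {"全文", "搜索", "引擎", "搜索引擎", "中文", "分词"}
print(max_match("中文全文搜索引擎", lexicon))
```

With this lexicon the longest entry wins at each position, so "搜索引擎" is kept as one word instead of being split into "搜索" and "引擎".&lt;br /&gt;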
			</description>
			<pubDate>Wed, 11 Jun 2008 21:04:44 +0800</pubDate>
			<guid>http://www.feitui.com/topic/view/293.html</guid>
		</item>
		<item>
			<title>Building a custom search engine with PHP ... no reply</title>
			<link>http://www.feitui.com/topic/view/257.html</link>
			<comments>http://www.feitui.com/topic/view/257.html#reply</comments>
			<dc:creator>蓝色梦幻</dc:creator>
			<author>蓝色梦幻</author>
			<!--<enclosure url="http://www.feitui.com/img/p/1.jpg" type="image/jpeg" />
			<enclosure url="http://www.feitui.com/img/p/1_s.jpg" type="image/jpeg" />
			<enclosure url="http://www.feitui.com/img/p/1_n.jpg" type="image/jpeg" />-->
			<category>Sphinx</category>
			<description>
			&lt;a href="http://www-128.ibm.com/developerworks/cn/opensource/os-php-sphinxsearch/" rel="nofollow external" class="tpc"&gt;http://www-128.ibm.com/developerworks/cn/opensource/os-php-sphinxsearch/&lt;/a&gt;
			</description>
			<pubDate>Sun, 27 Apr 2008 20:57:48 +0800</pubDate>
			<guid>http://www.feitui.com/topic/view/257.html</guid>
		</item>
		<item>
			<title>Implementing Chinese full-text search with Sphinx ... no reply</title>
			<link>http://www.feitui.com/topic/view/254.html</link>
			<comments>http://www.feitui.com/topic/view/254.html#reply</comments>
			<dc:creator>蓝色梦幻</dc:creator>
			<author>蓝色梦幻</author>
			<!--<enclosure url="http://www.feitui.com/img/p/1.jpg" type="image/jpeg" />
			<enclosure url="http://www.feitui.com/img/p/1_s.jpg" type="image/jpeg" />
			<enclosure url="http://www.feitui.com/img/p/1_n.jpg" type="image/jpeg" />-->
			<category>Sphinx</category>
			<description>
			First, here is the configuration file:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
source joomlasrc&lt;br /&gt;
{&lt;br /&gt;
        type                                    = mysql&lt;br /&gt;
        sql_host                                = localhost&lt;br /&gt;
        sql_user                                = root&lt;br /&gt;
        sql_pass                                = password&lt;br /&gt;
        sql_db                                  = database&lt;br /&gt;
        sql_query_pre=  SET NAMES utf8&lt;br /&gt;
        sql_query                               = \&lt;br /&gt;
                SELECT id, title, introtext,unix_timestamp(created) as addtime \&lt;br /&gt;
                FROM jos_content&lt;br /&gt;
        sql_attr_timestamp              = addtime&lt;br /&gt;
        sql_ranged_throttle     = 0&lt;br /&gt;
}&lt;br /&gt;
index joomlainx&lt;br /&gt;
{&lt;br /&gt;
        source                  = joomlasrc&lt;br /&gt;
        path                    = /usr/local/sphinx/var/data/joomlainx&lt;br /&gt;
        docinfo                 = extern&lt;br /&gt;
        mlock                   = 0&lt;br /&gt;
        stopwords   =&lt;br /&gt;
        min_prefix_len  = 0&lt;br /&gt;
        min_infix_len  = 0&lt;br /&gt;
        min_word_len = 2&lt;br /&gt;
        charset_type = utf-8&lt;br /&gt;
        charset_table = U+FF10..U+FF19-&amp;gt;0..9, 0..9, U+FF41..U+FF5A-&amp;gt;a..z, U+FF21..U+FF3A-&amp;gt;a..z,\&lt;br /&gt;
        A..Z-&amp;gt;a..z, a..z, U+0149, U+017F, U+0138, U+00DF, U+00FF, U+00C0..U+00D6-&amp;gt;U+00E0..U+00F6,\&lt;br /&gt;
        U+00E0..U+00F6, U+00D8..U+00DE-&amp;gt;U+00F8..U+00FE, U+00F8..U+00FE, U+0100-&amp;gt;U+0101, U+0101,\&lt;br /&gt;
        U+0102-&amp;gt;U+0103, U+0103, U+0104-&amp;gt;U+0105, U+0105, U+0106-&amp;gt;U+0107, U+0107, U+0108-&amp;gt;U+0109,\&lt;br /&gt;
        U+0109, U+010A-&amp;gt;U+010B, U+010B, U+010C-&amp;gt;U+010D, U+010D, U+010E-&amp;gt;U+010F, U+010F,\&lt;br /&gt;
        U+0110-&amp;gt;U+0111, U+0111, U+0112-&amp;gt;U+0113, U+0113, U+0114-&amp;gt;U+0115, U+0115, U+0116-&amp;gt;U+0117,\&lt;br /&gt;
        U+0117, U+0118-&amp;gt;U+0119, U+0119, U+011A-&amp;gt;U+011B, U+011B, U+011C-&amp;gt;U+011D, U+011D,\&lt;br /&gt;
        U+011E-&amp;gt;U+011F, U+011F, U+0130-&amp;gt;U+0131, U+0131, U+0132-&amp;gt;U+0133, U+0133, U+0134-&amp;gt;U+0135,\&lt;br /&gt;
        U+0135, U+0136-&amp;gt;U+0137, U+0137, U+0139-&amp;gt;U+013A, U+013A, U+013B-&amp;gt;U+013C, U+013C,\&lt;br /&gt;
        U+013D-&amp;gt;U+013E, U+013E, U+013F-&amp;gt;U+0140, U+0140, U+0141-&amp;gt;U+0142, U+0142, U+0143-&amp;gt;U+0144,\&lt;br /&gt;
        U+0144, U+0145-&amp;gt;U+0146, U+0146, U+0147-&amp;gt;U+0148, U+0148, U+014A-&amp;gt;U+014B, U+014B,\&lt;br /&gt;
        U+014C-&amp;gt;U+014D, U+014D, U+014E-&amp;gt;U+014F, U+014F, U+0150-&amp;gt;U+0151, U+0151, U+0152-&amp;gt;U+0153,\&lt;br /&gt;
        U+0153, U+0154-&amp;gt;U+0155, U+0155, U+0156-&amp;gt;U+0157, U+0157, U+0158-&amp;gt;U+0159, U+0159,\&lt;br /&gt;
        U+015A-&amp;gt;U+015B, U+015B, U+015C-&amp;gt;U+015D, U+015D, U+015E-&amp;gt;U+015F, U+015F, U+0160-&amp;gt;U+0161,\&lt;br /&gt;
        U+0161, U+0162-&amp;gt;U+0163, U+0163, U+0164-&amp;gt;U+0165, U+0165, U+0166-&amp;gt;U+0167, U+0167,\&lt;br /&gt;
        U+0168-&amp;gt;U+0169, U+0169, U+016A-&amp;gt;U+016B, U+016B, U+016C-&amp;gt;U+016D, U+016D, U+016E-&amp;gt;U+016F,\&lt;br /&gt;
        U+016F, U+0170-&amp;gt;U+0171, U+0171, U+0172-&amp;gt;U+0173, U+0173, U+0174-&amp;gt;U+0175, U+0175,\&lt;br /&gt;
        U+0176-&amp;gt;U+0177, U+0177, U+0178-&amp;gt;U+00FF, U+00FF, U+0179-&amp;gt;U+017A, U+017A, U+017B-&amp;gt;U+017C,\&lt;br /&gt;
        U+017C, U+017D-&amp;gt;U+017E, U+017E, U+0410..U+042F-&amp;gt;U+0430..U+044F, U+0430..U+044F,\&lt;br /&gt;
        U+05D0..U+05EA, U+0531..U+0556-&amp;gt;U+0561..U+0586, U+0561..U+0587, U+0621..U+063A, U+01B9,\&lt;br /&gt;
        U+01BF, U+0640..U+064A, U+0660..U+0669, U+066E, U+066F, U+0671..U+06D3, U+06F0..U+06FF,\&lt;br /&gt;
        U+0904..U+0939, U+0958..U+095F, U+0960..U+0963, U+0966..U+096F, U+097B..U+097F,\&lt;br /&gt;
        U+0985..U+09B9, U+09CE, U+09DC..U+09E3, U+09E6..U+09EF, U+0A05..U+0A39, U+0A59..U+0A5E,\&lt;br /&gt;
        U+0A66..U+0A6F, U+0A85..U+0AB9, U+0AE0..U+0AE3, U+0AE6..U+0AEF, U+0B05..U+0B39,\&lt;br /&gt;
        U+0B5C..U+0B61, U+0B66..U+0B6F, U+0B71, U+0B85..U+0BB9, U+0BE6..U+0BF2, U+0C05..U+0C39,\&lt;br /&gt;
        U+0C66..U+0C6F, U+0C85..U+0CB9, U+0CDE..U+0CE3, U+0CE6..U+0CEF, U+0D05..U+0D39, U+0D60,\&lt;br /&gt;
        U+0D61, U+0D66..U+0D6F, U+0D85..U+0DC6, U+1900..U+1938, U+1946..U+194F, U+A800..U+A805,\&lt;br /&gt;
        U+A807..U+A822, U+0386-&amp;gt;U+03B1, U+03AC-&amp;gt;U+03B1, U+0388-&amp;gt;U+03B5, U+03AD-&amp;gt;U+03B5,\&lt;br /&gt;
        U+0389-&amp;gt;U+03B7, U+03AE-&amp;gt;U+03B7, U+038A-&amp;gt;U+03B9, U+0390-&amp;gt;U+03B9, U+03AA-&amp;gt;U+03B9,\&lt;br /&gt;
        U+03AF-&amp;gt;U+03B9, U+03CA-&amp;gt;U+03B9, U+038C-&amp;gt;U+03BF, U+03CC-&amp;gt;U+03BF, U+038E-&amp;gt;U+03C5,\&lt;br /&gt;
        U+03AB-&amp;gt;U+03C5, U+03B0-&amp;gt;U+03C5, U+03CB-&amp;gt;U+03C5, U+03CD-&amp;gt;U+03C5, U+038F-&amp;gt;U+03C9,\&lt;br /&gt;
        U+03CE-&amp;gt;U+03C9, U+03C2-&amp;gt;U+03C3, U+0391..U+03A1-&amp;gt;U+03B1..U+03C1,\&lt;br /&gt;
        U+03A3..U+03A9-&amp;gt;U+03C3..U+03C9, U+03B1..U+03C1, U+03C3..U+03C9, U+0E01..U+0E2E,\&lt;br /&gt;
        U+0E30..U+0E3A, U+0E40..U+0E45, U+0E47, U+0E50..U+0E59, U+A000..U+A48F, U+4E00..U+9FBF,\&lt;br /&gt;
        U+3400..U+4DBF, U+20000..U+2A6DF, U+F900..U+FAFF, U+2F800..U+2FA1F, U+2E80..U+2EFF,\&lt;br /&gt;
        U+2F00..U+2FDF, U+3100..U+312F, U+31A0..U+31BF, U+3040..U+309F, U+30A0..U+30FF,\&lt;br /&gt;
        U+31F0..U+31FF, U+AC00..U+D7AF, U+1100..U+11FF, U+3130..U+318F, U+A000..U+A48F,\&lt;br /&gt;
        U+A490..U+A4CF&lt;br /&gt;
        ngram_len = 1&lt;br /&gt;
        ngram_chars = U+4E00..U+9FBF, U+3400..U+4DBF, U+20000..U+2A6DF, U+F900..U+FAFF,\&lt;br /&gt;
        U+2F800..U+2FA1F, U+2E80..U+2EFF, U+2F00..U+2FDF, U+3100..U+312F, U+31A0..U+31BF,\&lt;br /&gt;
        U+3040..U+309F, U+30A0..U+30FF, U+31F0..U+31FF, U+AC00..U+D7AF, U+1100..U+11FF,\&lt;br /&gt;
        U+3130..U+318F, U+A000..U+A48F, U+A490..U+A4CF&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
indexer&lt;br /&gt;
{&lt;br /&gt;
        mem_limit                       = 32M&lt;br /&gt;
}&lt;br /&gt;
searchd&lt;br /&gt;
{&lt;br /&gt;
        port                            = 3312&lt;br /&gt;
        log                                     = /usr/local/sphinx/var/log/searchd.log&lt;br /&gt;
        query_log                       = /usr/local/sphinx/var/log/query.log&lt;br /&gt;
        read_timeout            = 5&lt;br /&gt;
        max_children            = 30&lt;br /&gt;
        pid_file                        = /usr/local/sphinx/var/log/searchd.pid&lt;br /&gt;
        max_matches                     = 1000&lt;br /&gt;
        seamless_rotate         = 1&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
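You can use this as a reference. To use this configuration file, change the user, password, data directory and so on to match your own MySQL setup. The configuration on the official Sphinx site has some problems. The most critical one is that the line morphology = none cannot be used with Chinese; otherwise the searchd service will fail to start.&lt;br /&gt;
&lt;br /&gt;
Once the configuration is done, build the index:&lt;br /&gt;
&lt;br /&gt;
/usr/local/sphinx/bin/indexer --config /usr/local/sphinx/etc/sphinx.conf --all&lt;br /&gt;
&lt;br /&gt;
Then start searchd:&lt;br /&gt;
&lt;br /&gt;
/usr/local/sphinx/bin/searchd  --config /usr/local/sphinx/etc/sphinx.conf&lt;br /&gt;
&lt;br /&gt;
Now let's search through the PHP API. Copy the API files from the installation directory into the site's document root; the most important one is sphinxapi.php.&lt;br /&gt;
&lt;br /&gt;
Create a new file test.php with the following code:&lt;br /&gt;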
&lt;br /&gt;
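&amp;lt;?php&lt;br /&gt;
// Load the PHP client shipped with Sphinx&lt;br /&gt;
require ( "sphinxapi.php" );&lt;br /&gt;
$index = 'joomlainx';  // index name defined in sphinx.conf&lt;br /&gt;
$q = '模块';  // search keyword ("module")&lt;br /&gt;
$start = 0;  // offset of the first result&lt;br /&gt;
$limit = 10;  // number of results to return&lt;br /&gt;
$cl = new SphinxClient ();&lt;br /&gt;
$cl-&amp;gt;SetMatchMode(SPH_MATCH_ALL);  // match documents containing all query words&lt;br /&gt;
$cl-&amp;gt;SetLimits($start,$limit);&lt;br /&gt;
$res = $cl-&amp;gt;Query ( $q, $index );&lt;br /&gt;
if ( $res === false ) die ( $cl-&amp;gt;GetLastError() );  // Query returns false on failure&lt;br /&gt;
print_r($res);&lt;br /&gt;
?&amp;gt;&lt;br /&gt;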
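This is the simplest possible test. Make sure the file is saved as UTF-8 without a BOM. You can now test it at http://yourdomain.com/test.php; the result is an array. Sphinx does not return the title, content or other fields, so you need to look up each article's title and body in MySQL by id.&lt;br /&gt;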
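&lt;br /&gt;
The lookup step could be sketched as follows, in Python for brevity. This is only an illustration under assumptions: sqlite3 stands in for MySQL so that it runs without a server, the matches dictionary is a simplified, hypothetical stand-in for the result array, and fetch_articles is a made-up helper name.&lt;br /&gt;

```python
import sqlite3

# Hypothetical, simplified search result: document ids mapped to weights,
# already in ranked order. The real result array carries more fields.
matches = {3: 2, 1: 1}

# sqlite3 stands in for MySQL here so the sketch runs without a server.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE jos_content (id INTEGER PRIMARY KEY, title TEXT, introtext TEXT)")
db.executemany(
    "INSERT INTO jos_content VALUES (?, ?, ?)",
    [(1, "First", "intro one"), (2, "Second", "intro two"), (3, "Third", "intro three")],
)


def fetch_articles(conn, ids):
    """Load title and introtext for the given ids, keeping the ranked order."""
    ids = list(ids)
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    sql = "SELECT id, title, introtext FROM jos_content WHERE id IN (" + placeholders + ")"
    rows = {row[0]: row for row in conn.execute(sql, ids)}
    # IN (...) does not guarantee any order, so restore the search ranking.
    return [rows[i] for i in ids if i in rows]


for doc_id, title, intro in fetch_articles(db, matches):
    print(doc_id, title, intro)
```

Fetching all ids in one query and then reordering them in application code avoids one database round trip per hit.&lt;br /&gt;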
&lt;br /&gt;
Index files generated by Sphinx currently cannot be larger than 2 GB.&lt;br /&gt;
&lt;br /&gt;
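In a quick test, both indexing and searching were much faster than with Zend_Search_Lucene. With large amounts of data in particular, Zend_Search_Lucene's performance falls far behind. However, segmentation is currently unigram only (ngram_len = 1). The official site says work on this is in progress; once it is finished, search hits should be more precise.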
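&lt;br /&gt;
&lt;br /&gt;
To make the unigram point concrete, here is a small Python sketch of what single-character segmentation amounts to. It is an approximation for illustration, not Sphinx's tokenizer: it only checks two of the ranges listed in ngram_chars, and both function names are made up.&lt;br /&gt;

```python
def is_cjk(ch):
    """True for characters in two of the ranges listed in ngram_chars."""
    code = ord(ch)
    return code in range(0x4E00, 0x9FC0) or code in range(0x3400, 0x4DC0)


def unigram_tokens(text):
    """Split CJK characters into single-character tokens; keep other runs whole."""
    tokens = []
    word = ""
    for ch in text:
        if is_cjk(ch):
            # A CJK character ends any pending word and becomes its own token.
            if word:
                tokens.append(word)
                word = ""
            tokens.append(ch)
        elif ch.isalnum():
            word += ch.lower()
        else:
            # Whitespace and punctuation act as separators.
            if word:
                tokens.append(word)
                word = ""
    if word:
        tokens.append(word)
    return tokens


print(unigram_tokens("Sphinx 全文检索"))
```

Because every Chinese character is indexed as its own keyword, a document can match a query as long as it contains the same characters, which is likely why word-level segmentation is expected to give more precise hits.&lt;br /&gt;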
			</description>
			<pubDate>Thu, 24 Apr 2008 21:51:26 +0800</pubDate>
			<guid>http://www.feitui.com/topic/view/254.html</guid>
		</item>
		<item>
			<title>Installation guide for Sphinx, a full-text search engine for MySQL ... no reply</title>
			<link>http://www.feitui.com/topic/view/253.html</link>
			<comments>http://www.feitui.com/topic/view/253.html#reply</comments>
			<dc:creator>蓝色梦幻</dc:creator>
			<author>蓝色梦幻</author>
			<!--<enclosure url="http://www.feitui.com/img/p/1.jpg" type="image/jpeg" />
			<enclosure url="http://www.feitui.com/img/p/1_s.jpg" type="image/jpeg" />
			<enclosure url="http://www.feitui.com/img/p/1_n.jpg" type="image/jpeg" />-->
			<category>Sphinx</category>
			<description>
			This post briefly records my installation and testing process. MySQL is already installed at /usr/local/mysql, and the operating system is Linux AS4U4.&lt;br /&gt;
&lt;br /&gt;
 wget &lt;a href="http://www.sphinxsearch.com/downloads/sphinx-0.9.8-rc2.tar.gz" rel="nofollow external" class="tpc"&gt;http://www.sphinxsearch.com/downloads/sphinx-0.9.8-rc2.tar.gz&lt;/a&gt;&lt;br /&gt;
 tar -xvzf sphinx-0.9.8-rc2.tar.gz&lt;br /&gt;
 cd sphinx-0.9.8-rc2&lt;br /&gt;
 ./configure --prefix=/usr/local/sphinx --with-mysql=/usr/local/mysql&lt;br /&gt;
 make&lt;br /&gt;
 make install&lt;br /&gt;
 cd /usr/local/sphinx/etc&lt;br /&gt;
 cp sphinx.conf.dist sphinx.conf&lt;br /&gt;
 vi sphinx.conf&lt;br /&gt;
 Edit the user name and password in the configuration file.&lt;br /&gt;
 mysql -uroot -p &amp;lt; /usr/local/sphinx/etc/example.sql&lt;br /&gt;
&lt;br /&gt;
 vi /etc/ld.so.conf.d/mysqlclient15.conf&lt;br /&gt;
 Add the line /usr/local/mysql/lib&lt;br /&gt;
 ldconfig&lt;br /&gt;
 /usr/local/sphinx/bin/indexer --config=/usr/local/sphinx/etc/sphinx.conf --all&lt;br /&gt;
&lt;br /&gt;
 If everything is working, you will see the indexing complete.&lt;br /&gt;
 /usr/local/sphinx/bin/search test &lt;br /&gt;
 You should then see the query results.
			</description>
			<pubDate>Thu, 24 Apr 2008 21:49:37 +0800</pubDate>
			<guid>http://www.feitui.com/topic/view/253.html</guid>
		</item>
		<item>
			<title>The Sphinx configuration file explained ... no reply</title>
			<link>http://www.feitui.com/topic/view/252.html</link>
			<comments>http://www.feitui.com/topic/view/252.html#reply</comments>
			<dc:creator>蓝色梦幻</dc:creator>
			<author>蓝色梦幻</author>
			<!--<enclosure url="http://www.feitui.com/img/p/1.jpg" type="image/jpeg" />
			<enclosure url="http://www.feitui.com/img/p/1_s.jpg" type="image/jpeg" />
			<enclosure url="http://www.feitui.com/img/p/1_n.jpg" type="image/jpeg" />-->
			<category>Sphinx</category>
			<description>
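			Sphinx uses sphinx.conf as its configuration file, and both indexing and searching are driven by it. To run full-text search you first need to set up sphinx.conf, telling Sphinx which fields to index and which fields will be used in where, order by and group by.&lt;br /&gt;
&lt;br /&gt;
Structure of sphinx.conf&lt;br /&gt;
&lt;br /&gt;
source source_name_1 {&lt;br /&gt;
…&lt;br /&gt;
}&lt;br /&gt;
index index_name_1 {&lt;br /&gt;
source = source_name_1&lt;br /&gt;
…&lt;br /&gt;
}&lt;br /&gt;
source source_name_2 {&lt;br /&gt;
…&lt;br /&gt;
}&lt;br /&gt;
index index_name_2 {&lt;br /&gt;
source = source_name_2&lt;br /&gt;
…&lt;br /&gt;
}&lt;br /&gt;
indexer {&lt;br /&gt;
…&lt;br /&gt;
}&lt;br /&gt;
searchd {&lt;br /&gt;
…&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
Source section options&lt;br /&gt;
&lt;br /&gt;
#type database type&lt;br /&gt;
#strip_html whether to strip HTML tags&lt;br /&gt;
#sql_host database host address&lt;br /&gt;
#sql_user database user name&lt;br /&gt;
#sql_pass database password&lt;br /&gt;
#sql_db   database name&lt;br /&gt;
#sql_port port used by the database&lt;br /&gt;
#sql_query_pre statement run before the main query to set the character set; for utf8 it must be SET NAMES utf8&lt;br /&gt;
#sql_query  query that fetches the content for full-text search&lt;br /&gt;
&lt;br /&gt;
Index section options&lt;br /&gt;
&lt;br /&gt;
#source data source name&lt;br /&gt;
#path   location where the index files are stored&lt;br /&gt;
#If the content being searched is not Chinese, charset_table, ngram_chars and min_word_len need to be set differently.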
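&lt;br /&gt;
&lt;br /&gt;
The block layout described above can be checked with a short Python sketch that lists the sections of a sphinx.conf-style text. This is a simplified illustration, not Sphinx's own parser: parse_blocks and the sample text are made up, and values continued over several lines with a trailing backslash, or values that contain a # character, are not handled.&lt;br /&gt;

```python
import re


def parse_blocks(conf_text):
    """Return a list of (section_type, name, settings) from sphinx.conf-style text."""
    blocks = []
    # Matches "type name { body }" and "type { body }". The name is optional
    # because the indexer and searchd sections have none.
    pattern = re.compile(r"(\w+)[ \t]*(\w*)\s*\{([^}]*)\}")
    for kind, name, body in pattern.findall(conf_text):
        settings = {}
        for line in body.splitlines():
            # Drop comments, then keep "key = value" pairs.
            line = line.split("#", 1)[0].strip()
            if "=" in line:
                key, value = line.split("=", 1)
                settings[key.strip()] = value.strip()
        blocks.append((kind, name, settings))
    return blocks


# Hypothetical sample, for illustration only.
sample = """
source src1
{
    type = mysql
    sql_host = localhost
}
index idx1
{
    source = src1
    path = /tmp/idx1
}
searchd
{
    port = 3312
}
"""
for kind, name, settings in parse_blocks(sample):
    print(kind, name, settings)
```

Each index block refers to its source block by name through the source setting, which is how the two sections are tied together.&lt;br /&gt;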
			</description>
			<pubDate>Thu, 24 Apr 2008 21:48:35 +0800</pubDate>
			<guid>http://www.feitui.com/topic/view/252.html</guid>
		</item>
		<item>
			<title>Sphinx manual in Chinese ... no reply</title>
			<link>http://www.feitui.com/topic/view/207.html</link>
			<comments>http://www.feitui.com/topic/view/207.html#reply</comments>
			<dc:creator>蓝色梦幻</dc:creator>
			<author>蓝色梦幻</author>
			<!--<enclosure url="http://www.feitui.com/img/p/1.jpg" type="image/jpeg" />
			<enclosure url="http://www.feitui.com/img/p/1_s.jpg" type="image/jpeg" />
			<enclosure url="http://www.feitui.com/img/p/1_n.jpg" type="image/jpeg" />-->
			<category>Sphinx</category>
			<description>
			&lt;a href="http://www.sphinxsearch.com/wiki/doku.php?id=sphinx_chinese_tutorial" rel="nofollow external" class="tpc"&gt;http://www.sphinxsearch.com/wiki/doku.php?id=sphinx_chinese_tutorial&lt;/a&gt;
			</description>
			<pubDate>Mon, 24 Mar 2008 21:47:18 +0800</pubDate>
			<guid>http://www.feitui.com/topic/view/207.html</guid>
		</item>
	</channel>
</rss>