Package org.apache.lucene.analysis
Since: Not specified.
Version: Not specified.
Author: Not specified.
| Analyzer |
An Analyzer builds TokenStreams, which analyze text. |
| CharTokenizer |
An abstract base class for simple, character-oriented tokenizers. |
| ISOLatin1AccentFilter |
A filter that replaces accented characters in the ISO Latin 1 character set
(ISO-8859-1) with their unaccented equivalents. |
| KeywordAnalyzer |
"Tokenizes" the entire stream as a single token. |
| KeywordTokenizer |
Emits the entire input as a single token. |
| LengthFilter |
Removes words that are too long or too short from the stream. |
| LetterTokenizer |
A LetterTokenizer is a tokenizer that divides text at non-letters. |
| LowerCaseFilter |
Normalizes token text to lower case. |
| LowerCaseTokenizer |
LowerCaseTokenizer performs the function of LetterTokenizer
and LowerCaseFilter together. |
| PerFieldAnalyzerWrapper |
An analyzer for cases where different fields require different
analysis techniques. |
| PorterStemFilter |
Transforms the token stream as per the Porter stemming algorithm. |
| SimpleAnalyzer |
An Analyzer that filters LetterTokenizer with LowerCaseFilter. |
| StopAnalyzer |
Filters LetterTokenizer with LowerCaseFilter and StopFilter. |
| StopFilter |
Removes stop words from a token stream. |
| Token |
A Token is an occurrence of a term from the text of a field. |
| TokenFilter |
A TokenFilter is a TokenStream whose input is another token stream. |
| Tokenizer |
A Tokenizer is a TokenStream whose input is a Reader. |
| TokenStream |
A TokenStream enumerates the sequence of tokens, either from
fields of a document or from query text. |
| WhitespaceAnalyzer |
An Analyzer that uses WhitespaceTokenizer. |
| WhitespaceTokenizer |
A WhitespaceTokenizer is a tokenizer that divides text at whitespace. |
| WordlistLoader |
Loader for text files that represent a list of stopwords. |
API and code to convert text into indexable tokens.
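
The class summaries above describe a chain: a Tokenizer reads text, and each TokenFilter wraps another token stream. The following standalone sketch illustrates that chaining idea in plain Java. It does not use the Lucene API: `SimpleTokenStream`, the helper method names, the stop-word set, and the length bounds are all hypothetical and chosen only for illustration, and the behavior is a rough approximation of what the LetterTokenizer, LowerCaseFilter, StopFilter, and LengthFilter descriptions say.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class AnalysisSketch {

    /** Hypothetical stand-in for a token stream; next() returns null when exhausted. */
    interface SimpleTokenStream {
        String next();
    }

    /** Divides text at non-letters, as the LetterTokenizer summary describes. */
    static SimpleTokenStream letterTokenizer(String text) {
        return new SimpleTokenStream() {
            private int pos = 0;

            public String next() {
                // Skip any run of non-letter characters.
                while (pos < text.length() && !Character.isLetter(text.charAt(pos))) {
                    pos++;
                }
                if (pos >= text.length()) {
                    return null;
                }
                int start = pos;
                // Consume the run of letters that forms the token.
                while (pos < text.length() && Character.isLetter(text.charAt(pos))) {
                    pos++;
                }
                return text.substring(start, pos);
            }
        };
    }

    /** Normalizes token text to lower case. */
    static SimpleTokenStream lowerCaseFilter(SimpleTokenStream in) {
        return () -> {
            String t = in.next();
            return t == null ? null : t.toLowerCase(Locale.ROOT);
        };
    }

    /** Drops tokens that appear in the given stop-word set. */
    static SimpleTokenStream stopFilter(SimpleTokenStream in, Set<String> stopWords) {
        return () -> {
            String t;
            while ((t = in.next()) != null) {
                if (!stopWords.contains(t)) {
                    return t;
                }
            }
            return null;
        };
    }

    /** Drops tokens shorter than min or longer than max characters. */
    static SimpleTokenStream lengthFilter(SimpleTokenStream in, int min, int max) {
        return () -> {
            String t;
            while ((t = in.next()) != null) {
                if (t.length() >= min && t.length() <= max) {
                    return t;
                }
            }
            return null;
        };
    }

    /** Builds the chain tokenizer -> lower case -> stop words -> length and drains it. */
    static List<String> analyze(String text) {
        Set<String> stopWords = Set.of("the", "a", "of"); // illustrative stop words only
        SimpleTokenStream stream =
            lengthFilter(stopFilter(lowerCaseFilter(letterTokenizer(text)), stopWords), 2, 10);
        List<String> tokens = new ArrayList<>();
        for (String t = stream.next(); t != null; t = stream.next()) {
            tokens.add(t);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("The Quick-Brown fox"));
    }
}
```

Each filter only knows about the stream it wraps, so stages can be reordered or omitted without touching the others. This mirrors the summaries of SimpleAnalyzer and StopAnalyzer, which are described as LetterTokenizer filtered by LowerCaseFilter (and, for StopAnalyzer, StopFilter).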