Word Frequency Counter

Count the frequency of words in Unicode and ASCII text files

Define word separators

Use tick boxes to add and remove commonly used word separators.

Add word separators

Define the characters to use as word separators.

Word separators

A word separator is any character which is used to distinguish between words for word frequency counting. WordFrequencyCounter always considers a space to be a separator, but also allows you to define other characters to be used in addition.

By default, WordFrequencyCounter uses ,.?:;()![]{}#$£% and & as separators. Each of these can be added or removed as a word separator using the tick boxes, and doing so will affect the frequencies of different words.
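To illustrate how the choice of separators changes the counts, here is a minimal Python sketch. It is not WordFrequencyCounter's own code: the function name count_words and the constant DEFAULT_SEPARATORS are hypothetical, and the sketch assumes that any whitespace (not only the space character) separates words and that counting is case sensitive.

```python
import re
from collections import Counter

# Assumed default separators, copied from the list given in the text.
DEFAULT_SEPARATORS = ",.?:;()![]{}#$£%&"

def count_words(text, separators=DEFAULT_SEPARATORS):
    """Count words, splitting on whitespace plus the given separator characters."""
    # re.escape makes every separator a literal inside the character class;
    # \s is always included because whitespace always separates words here.
    pattern = "[" + re.escape(separators) + r"\s]+"
    words = [w for w in re.split(pattern, text) if w]
    return Counter(words)

text = "The cat sat. The cat ran!"
with_defaults = count_words(text)
without_separators = count_words(text, separators="")

print(with_defaults["sat"])        # 1
print(without_separators["sat"])   # 0, because the word was counted as "sat."
print(without_separators["sat."])  # 1
```

With no punctuation separators, "sat." and "sat" would be counted as two different words, which is why the tick boxes change the resulting frequencies.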

Some characters need particular care when used as word separators.

The @ sign is used in e-mail addresses. If you use it as a word separator, each address will be split into two separate words, which is unlikely to be the desired result.

The hyphen (-) is used frequently and inconsistently, with even dictionaries giving both hyphenated and non-hyphenated versions of the same word, e.g. railroad or rail-road. If you use - as a word separator, rail-road will be split into two separate words.
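The effect of these characters can be seen in a short Python sketch. The helper split_words is hypothetical and only approximates the behaviour described here; it assumes whitespace always separates words. An e-mail address such as user@example.com contains an @ sign, so it is split when that character is treated as a separator.

```python
import re

def split_words(text, separators):
    """Split text on whitespace plus each character in `separators`."""
    pattern = "[" + re.escape(separators) + r"\s]+"
    return [w for w in re.split(pattern, text) if w]

text = "Take the rail-road or write to user@example.com"

# No extra separators: the hyphenated word and the address stay whole.
kept_whole = split_words(text, "")
# Treating - and @ as separators breaks both into two words each.
split_up = split_words(text, "-@")

print(kept_whole)  # ['Take', 'the', 'rail-road', 'or', 'write', 'to', 'user@example.com']
print(split_up)    # ['Take', 'the', 'rail', 'road', 'or', 'write', 'to', 'user', 'example.com']
```

Whether to split rail-road depends on the analysis: splitting adds to the counts for rail and road, while keeping it whole counts rail-road separately from railroad.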

Separators in languages other than English

The most common separators (apart from the space character) are punctuation marks. WordFrequencyCounter allows you to define additional characters as separators. This is useful if you are counting words in a language other than English, which may use different or additional punctuation characters. For example, if you are analysing Spanish text, which uses ¿ and ¡ in addition to ? and !, you would need to add these characters as separators.
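The Spanish case can be sketched in Python as follows. This is an illustration under assumptions rather than the program's actual implementation: count_words is a hypothetical helper, whitespace is assumed to always separate words, and counting is assumed to be case sensitive.

```python
import re
from collections import Counter

def count_words(text, separators):
    """Count words, splitting on whitespace plus each character in `separators`."""
    pattern = "[" + re.escape(separators) + r"\s]+"
    return Counter(w for w in re.split(pattern, text) if w)

spanish = "¿Qué hora es? ¡Qué tarde!"

# Only ? and ! as separators: the inverted marks stay attached to the words.
english_only = count_words(spanish, "?!")
# Adding ¿ and ¡ lets both occurrences of "Qué" be counted together.
with_spanish = count_words(spanish, "?!¿¡")

print(english_only["Qué"])   # 0
print(english_only["¿Qué"])  # 1
print(with_spanish["Qué"])   # 2
```

Without the extra separators, ¿Qué and ¡Qué are counted as two distinct words and Qué itself is never counted.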